TITLE OF THE INVENTION
WORD RECOGNITION METHOD AND STORAGE MEDIUM THAT STORES 
WORD RECOGNITION PROGRAM 

CROSS-REFERENCE TO RELATED APPLICATIONS 
This application is based upon and claims the 
benefit of priority from the prior Japanese Patent 
Application No. 2000-020300, filed January 28, 2000, 
the entire contents of which are incorporated herein 
by reference. 

BACKGROUND OF THE INVENTION 

The present invention relates to a word recognition
method for performing word recognition in an
optical character reader that optically reads a word
consisting of a plurality of characters written
on a material targeted for reading. In addition, the
present invention relates to a storage medium that
stores a word recognition program for causing a computer
to perform the word recognition processing.

In general, when an optical character reader reads
characters written on a material targeted for reading,
the characters can be read precisely by using knowledge
of words, even if the recognition precision for
individual characters is low. Conventionally, a variety
of methods for doing so have been proposed.

For example, in the invention disclosed in Jpn.
Pat. Appln. KOKAI Publication No. 10-177624, a distance
(the smaller the distance, the more reliable the
recognition result) is used as a result of character
recognition, and an evaluation value of a word is
obtained by summing these distances.

In addition, in the invention disclosed in Jpn.
Pat. Appln. KOKAI Publication No. 8-167008, candidates
of words are narrowed at the stage of character
recognition, each of the narrowed candidates is matched
against each word, and an evaluation value of a word is
obtained from the number of coincident characters.

Further, in the disclosure of Japanese Electronics &
Communications Society Paper, Vol. 52-C, No. 6, June
1969, pages 305 to 312, a posteriori probability is
used as an evaluation value of words.

The posteriori probability will be described here.

A probability at which an event (b) occurs is
expressed as P (b).

A probability at which an event (b) occurs after
an event (a) has occurred is expressed as P (b | a).

When the event (b) occurs irrespective of whether or
not the event (a) occurs, P (b | a) is the same as
P (b). In contrast, a probability at which the event
(b) occurs under the influence of the event (a) after
the event (a) has occurred is referred to as a
posteriori probability, and is expressed as P (b | a).

However, each of these conventional methods is
meaningful only when the number of characters in a word
is constant. If the number of characters is not
constant, these methods either cannot be used or fail
when they are used. That is, in the invention disclosed
in Jpn. Pat. Appln. KOKAI Publication No. 10-177624,
the smaller the number of characters, the smaller the
evaluation value. Thus, a word with fewer characters
is prone to be selected.

In addition, in the invention disclosed in Jpn.
Pat. Appln. KOKAI Publication No. 8-167008 and in the
disclosure of the Japanese Electronics & Communications
Society paper, it is presumed that the number of
characters is constant. When the number of characters
is not constant, they cannot be used.

Further, a conventional evaluation function for
word recognition fails to consider the ambiguity of
word delimiting, the absence of character spacing,
noise entry, and the ambiguity of character delimiting.

BRIEF SUMMARY OF THE INVENTION

It is an object of the present invention to
provide a word recognition method, and a storage medium
that stores a word recognition program, capable of
performing word recognition precisely even in the case
where the number of characters in a word is not
constant.

It is another object of the present invention to
provide a word recognition method, and a storage medium
that stores a word recognition program, capable of
performing word recognition precisely even in the case
where word delimiting is not reliable.

It is still another object of the present
invention to provide a word recognition method, and a
storage medium that stores a word recognition program,
capable of performing word recognition precisely even
in the case where no character spacing is provided or
noise entry occurs.

It is a further object of the present invention to
provide a word recognition method, and a storage medium
that stores a word recognition program, capable of
performing word recognition precisely even in the case
where character delimiting is not reliable.

A word recognition method according to the present
invention comprises: a character recognition processing
step of performing recognition processing, character by
character, on an input character string that corresponds
to a word to be recognized, to obtain a result of
character recognition; a probability calculation
step of obtaining, conditioned on each character of
each word in a word dictionary that stores in advance
candidates of words to be recognized, a probability
at which there appear the characteristics obtained as
the result of character recognition in the character
recognition processing step; a first computation step
of performing a predetermined first computation between
the probability obtained in the probability calculation
step and a probability at which there appear the
characteristics obtained as the result of character
recognition in the character recognition processing
step; a second computation step of performing
a predetermined second computation between the
computation results obtained by the first computation
for the characters of each word contained in the word
dictionary; and a word recognition processing step of
obtaining the recognition result of the word based on
the second computation result obtained in the second
computation step.

In addition, the word recognition method according
to the present invention comprises: a delimiting step
of isolating, character by character, an input character
string that corresponds to a word to be recognized; a
step of obtaining plural kinds of delimiting results
in consideration of whether or not character spacing is
provided; a character recognition processing step of
performing recognition processing for the characters
relevant to all the delimiting results obtained in
the preceding step; a probability calculation
step of obtaining, conditioned on each character of
each word contained in a word dictionary that stores in
advance candidates of words to be recognized, a
probability at which there appear the characteristics
obtained as the result of character recognition in
the character recognition processing step; a first
computation step of performing a predetermined first
computation between the probability obtained in the
probability calculation step and a probability at which
there appear the characteristics obtained as the result
of character recognition in the character recognition
processing step; a second computation step of
performing a predetermined second computation between
the computation results obtained by the first
computation for the characters of each word contained
in the word dictionary; and a word recognition
processing step of obtaining the recognition result of
the word based on the result of the second computation
in the second computation step.

In addition, a storage medium according to the
present invention is a computer readable storage medium
that stores a word recognition program for causing
a computer to perform word recognition processing,
the word recognition program comprising: a character
recognition processing step of performing recognition
processing, character by character, on an input
character string that corresponds to a word to be
recognized; a probability calculation step of
obtaining, conditioned on each character of each word
contained in a word dictionary that stores in advance
candidates of words to be recognized, a probability at
which there appear the characteristics obtained as
the result of character recognition in the character
recognition processing step; a first computation step
of performing a predetermined first computation between
the probability obtained in the probability calculation
step and a probability at which there appear the
characteristics obtained as the result of character
recognition in the character recognition processing
step; a second computation step of performing a
predetermined second computation between the computation
results obtained by the first computation for the
characters of each word contained in the word
dictionary; and a word recognition processing step of
obtaining the recognition result of the word based on
the second computation result obtained in the second
computation step.

According to the present invention, in word
recognition using the character recognition result,
an evaluation function is used based on a posteriori
probability that can be used even in the case where the
number of characters in a word is not always constant.
In this way, even in the case where the number of
characters in a word is not constant, word recognition
can be performed precisely.

In addition, according to the present invention,
in word recognition using the character recognition
result, an evaluation function is used based on
a posteriori probability considering at least the
ambiguity of word delimiting. In this way, even if
the word delimiting is not reliable, word recognition
can be performed precisely.

In addition, according to the present invention,
in word recognition using the character recognition
result, an evaluation function is used based on a
posteriori probability considering at least the fact
that no character spacing is provided. In this way,
even in the case where no character spacing is
provided, word recognition can be performed precisely.

In addition, according to the present invention,
in word recognition using the character recognition
result, an evaluation function is used based on
a posteriori probability considering at least noise
entry. In this way, even if noise entry occurs,
word recognition can be performed precisely.

Further, according to the present invention,
in word recognition using the character recognition
result, an evaluation function is used based on
a posteriori probability considering at least the
ambiguity of character delimiting. In this way,
even if character delimiting is not reliable, word
recognition can be performed precisely.

Additional objects and advantages of the invention
will be set forth in the description which follows, and
in part will be obvious from the description, or may
be learned by practice of the invention. The objects
and advantages of the invention may be realized and
obtained by means of the instrumentalities and
combinations particularly pointed out hereinafter.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWING
The accompanying drawings, which are incorporated
in and constitute a part of the specification,
illustrate presently preferred embodiments of the
invention, and together with the general description
given above and the detailed description of the
preferred embodiments given below, serve to explain
the principles of the invention.

FIG. 1 is a block diagram schematically depicting
a configuration of a word recognition system for
achieving a word recognition method according to
an embodiment of the present invention;

FIG. 2 is a view showing a description example of
a mail on which an address is described;

FIG. 3 is a flow chart illustrating an outline of
the word recognition method;

FIG. 4 is a view showing a character pattern
identified as a city name;

FIG. 5 is a view showing the contents of a word
dictionary;

FIG. 6 and FIG. 7 are views showing the contents
of a probability table;

FIG. 8 is a view showing a description example of
a mail on which an address is described;

FIG. 9 is a view showing a character pattern
identified as a city name;

FIG. 10 is a view showing the contents of a word
dictionary;

FIG. 11 is a view showing the contents of
a probability table;

FIG. 12 is a view showing a description example of
a mail on which an address is described;

FIG. 13 is a view showing a character pattern
identified as a city name;

FIG. 14A to FIG. 14C are views showing the
contents of a word dictionary;

FIG. 15 is a view showing a set of categories
relevant to the word dictionary shown in FIG. 14A to
FIG. 14C;

FIG. 16 is a view showing a description example of
a mail on which an address is described;

FIG. 17 is a view showing a character pattern
identified as a city name;

FIG. 18 is a view showing the contents of a word
dictionary;

FIG. 19 is a view showing a set of categories
relevant to the word dictionary shown in FIG. 18;

FIG. 20 is a view showing cells processed as
representing a city name;

FIG. 21A to FIG. 21D are views showing character
delimiting pattern candidates;

FIG. 22 is a view showing the contents of a word
dictionary;

FIG. 23A to FIG. 23D are views showing a set of
categories relevant to the word dictionary shown in
FIG. 22;

FIG. 24 is a view showing the recognition result
of each unit relevant to the character delimiting
pattern candidate; and

FIG. 25 is a view showing characteristics of
character intervals.

DETAILED DESCRIPTION OF THE INVENTION 

Hereinafter, preferred embodiments of the present
invention will be described with reference to the
accompanying drawings.

FIG. 1 schematically depicts a configuration of
a word recognition system for achieving a word
recognition method according to an embodiment of the
present invention.

In FIG. 1, this word recognition system is
composed of: a CPU (central processing unit) 1; an
input device 2; a scanner 3 that is image input means;
a display device 4; a first memory 5 that is storage
means; a second memory 6 that is storage means; and
a reader 7.
The CPU 1 executes an operating system program
and an application program (a word recognition program
or the like), both stored in the second memory 6,
thereby performing word recognition processing as
described later in detail.

The input device 2 consists of a keyboard and
a mouse, for example, and is used by a user to perform
a variety of operations or input a variety of data.

The scanner 3 reads, through scanning, the
characters of a word written on a material targeted
for reading, and inputs these characters. The material
targeted for reading includes a mail P on which
an address is written, for example. In the method
of writing the address, as shown in FIG. 2, the
postal code, name of state, city name, street name,
and street number are written in order from the
lowest line and from the right side.

The display device 4 consists of a display unit
and a printer, for example, and outputs a variety of
data.

The first memory 5 is composed of a RAM (random
access memory), for example. This memory is used as
a work memory of the CPU 1, and temporarily stores
a variety of data or the like being processed.

The second memory 6 is composed of a hard disk
unit, for example, and stores a variety of programs or
the like for operating the CPU 1. The second memory 6
stores: an operating system program for operating the
input device 2, scanner 3, display device 4, first
memory 5, and reader 7; a word recognition program and
a character dictionary 9 for recognizing the characters
that constitute a word; a word dictionary 10 for word
recognition; and a probability table 11 that stores
probabilities of appearance of the characters that
constitute a word, or the like. The word
dictionary 10 stores in advance a plurality of
candidates of words to be recognized. This dictionary
can be a city name dictionary in which the city names
of the region where the word recognition system is
installed, for example, the city names in a state, are
registered.

The reader 7 consists of a CD-ROM drive unit or
the like, for example, and reads the word recognition
program and the word dictionary 10 for word recognition
stored in a CD-ROM 8 that is a storage medium. The
word recognition program, character dictionary 9, word
dictionary 10, and probability table 11 read by the
reader 7 are stored in the second memory 6.

Now, an outline of a word recognition method will
be described with reference to a flow chart shown in
FIG. 3.

First, image acquisition processing for acquiring
(reading) an image of a mail P is performed by means of
the scanner 3 (ST1). Region detection processing for
detecting a region in which an address is written is
performed by using the image acquired by the image
acquisition processing (ST2). Delimiting processing is
then performed in which vertical projection or
horizontal projection is used to identify, as a
rectangular region, the character pattern of each
character of the word that corresponds to a city name,
from the address region detected by the region
detection processing (ST3). Character recognition
processing for acquiring character recognition
candidates is performed based on a degree of similarity
obtained by comparing the character pattern of each
character of the word identified by this delimiting
processing with the character patterns stored in the
character dictionary 9 (ST4). By using the recognition
result of each character of the word obtained by this
character recognition processing, each of the characters
of the city names stored in the word dictionary 10, and
the probability table 11, the posteriori probability
is calculated for each city name contained in the word
dictionary 10, and word recognition processing is
performed in which the word with the highest posteriori
probability is taken as the recognition result (ST5).
Each of the above processing functions is controlled by
the CPU 1.
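The delimiting processing of step ST3 can be illustrated with a minimal sketch of vertical projection. The sketch assumes a binarized image given as a list of rows in which 1 denotes an ink pixel, and it assumes that characters are separated by at least one empty column; the function names are hypothetical and are not taken from the embodiment.

```python
def column_profile(image):
    """Vertical projection: count the ink pixels in each column."""
    return [sum(column) for column in zip(*image)]


def delimit_characters(image):
    """Return (start, end) column ranges, end exclusive, of each run of
    columns that contains ink. Each run is one candidate character pattern."""
    spans = []
    start = None
    for x, count in enumerate(column_profile(image)):
        if count > 0 and start is None:
            start = x                      # a character pattern begins
        elif count == 0 and start is not None:
            spans.append((start, x))       # an empty column ends the pattern
            start = None
    if start is not None:                  # pattern touching the right edge
        spans.append((start, len(image[0])))
    return spans


# Hypothetical 3 x 8 binary image containing three separated patterns.
image = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1, 0, 1, 1],
]
print(delimit_characters(image))
```

The widths of the empty runs between the returned ranges are the gaps that may be used to judge a word break or the presence of character spacing.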

When the character pattern delimiting processing is
performed in step ST3, a word break
may be judged based on the character pattern of each
character and the size of the gaps between the
character patterns. In addition, whether or not
character spacing is provided may be judged based on
the size of the gaps.

A word recognition method according to an 
embodiment of the present invention is achieved in such 
a system configuration. Now, an outline of the word 
recognition method will be described below. 

1. OUTLINE

For example, consider character reading by an
optical character reader. No problem occurs when the
character reader has such high character reading
performance that it hardly makes a mistake; however,
it is difficult to achieve such high performance in,
for example, recognition of handwritten characters.
Thus, recognition precision is enhanced by using
knowledge of words. Specifically, the word that is
believed to be correct is selected from a word
dictionary. To this end, a certain evaluation
value is calculated for each word, and the word with
the highest (or lowest) evaluation value is obtained as
the recognition result. Although a variety of evaluation
functions have been proposed as described previously,
the variety of problems described previously still
remain unsolved.

In the present embodiment, a posteriori
probability that takes into account the variety of
problems described previously is used as the evaluation
function. In this way, all data concerning a difference
in the number of characters, the ambiguity of word
delimiting, the absence of character spacing, noise
entry, and character break can be naturally
incorporated in one evaluation function by calculation
of probability.

Now, a general theory of Bayes Estimation used in
the present invention will be described below.

2. GENERAL THEORY OF BAYES ESTIMATION

An input pattern (input character string) is
defined as "x". In recognition processing, certain
processing is performed on "x", and a classification
result is obtained. This processing can be roughly
divided into the two processes below.

(1) A characteristic "r" (= R (x)) is obtained by
applying to "x" characteristics extraction processing R
for obtaining some characteristic quantity.

(2) The classification result "ki" is obtained by
applying some evaluation method to the
characteristic "r".

The classification result "ki" corresponds to the
"recognition result". Note that, in word recognition,
the "recognition result" of character recognition is
used as one of the characteristics. Hereinafter, the
terms "characteristics" and "recognition result" are
used distinctly.

The Bayes Estimation is used as the evaluation
method in the second process. The category "ki" with
the highest posteriori probability P (ki | r) is
obtained as the result of recognition. In the case
where it is difficult or impossible to directly
calculate the posteriori probability P (ki | r), the
probability is calculated indirectly by using the Bayes
Estimation Theory, i.e., the following formula.

$$P(k_i \mid r) = \frac{P(r \mid k_i)\,P(k_i)}{P(r)} \tag{1}$$

The denominator P (r) is a constant that does not
depend on "i". Thus, by calculating the numerator
P (r | ki) P (ki), the magnitude of the posteriori
probability P (ki | r) can be evaluated.
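The selection rule described above can be sketched as follows. This is only an illustration: the likelihood and prior values are hypothetical, and the sketch additionally assumes that the listed categories are mutually exclusive and exhaustive, so that P (r) can be obtained by summing the numerators.

```python
def bayes_classify(likelihood, prior):
    """Return the category k that maximizes P(r | k) * P(k), with all scores.

    likelihood[k] is P(r | k) and prior[k] is P(k). The denominator P(r) is
    never needed for the decision because it is the same for every k.
    """
    scores = {k: likelihood[k] * prior[k] for k in likelihood}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical values, for illustration only.
likelihood = {"k1": 0.02, "k2": 0.10, "k3": 0.05}   # P(r | ki)
prior = {"k1": 0.5, "k2": 0.2, "k3": 0.3}           # P(ki)

best, scores = bayes_classify(likelihood, prior)

# Assumption: the categories are exhaustive, so P(r) is the sum of the numerators.
evidence = sum(scores.values())
posterior = {k: s / evidence for k, s in scores.items()}
print(best, posterior)
```

Dividing every numerator by the same P (r) does not change which category is largest, which is why the numerator alone suffices for recognition.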

Now, for a better understanding of the following
description, a description will be given of the Bayes
Estimation in word recognition when the number of
characters is constant. In this case, the Bayes
Estimation is effective in English or any other
language in which a word break may occur.

3. BAYES ESTIMATION WHEN THE NUMBER OF CHARACTERS
IS CONSTANT

3.1 Definition of Formula

This section assumes that character delimiting and
word delimiting are completely successful, and that the
number of characters is uniquely determined with no
noise entry between characters. The following are
defined.



• Number of characters: $L$

• Category set: $K = \{k_i\}$
  $k_i = w_i$, $w_i \in W$, $W$: set of words with the
  number of characters $L$

• $w_i = (w_{i1}, w_{i2}, \ldots, w_{iL})$
  $w_{ij}$: $j$-th character of $w_i$, $w_{ij} \in C$, $C$:
  character set

• Characteristics: $r = (r_1, r_2, r_3, \ldots, r_L)$
  $r_i$: character characteristic of the $i$-th character
  (= character recognition result)
  (Example: first candidate, first to third
  candidates, candidates having a predetermined
  similarity, first and second candidates and their
  similarities, or the like)

In the foregoing description, "wa" may be
expressed in place of "wi".

At this time, assume that a written word is estimated
based on the Bayes Estimation.

$$P(k_i \mid r) = \frac{P(r \mid k_i)\,P(k_i)}{P(r)} \tag{2}$$

P (r | ki) is represented as follows.

$$P(r \mid k_i) = P(r_1 \mid w_{i1})\,P(r_2 \mid w_{i2}) \cdots P(r_L \mid w_{iL}) = \prod_{j=1}^{L} P(r_j \mid w_{ij}) \tag{3}$$

Assume that P (ki) is statistically obtained in
advance. For example, in reading an address on a mail,
P (ki) may depend on the position in the letter or
the position in the line, as well as on the statistics
of addresses.

Although P (r | ki) is represented as a product,
this product can be converted into a sum by taking,
for example, a logarithm, without being limited thereto.
The same applies to the following description.
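The conversion of the product into a sum can be sketched with logarithms as below. The per-character probabilities are hypothetical, and the function names are introduced only for this illustration. Because the logarithm is monotonically increasing, comparing sums of logarithms selects the same word as comparing the products.

```python
import math


def word_likelihood(char_probs):
    """P(r | ki) as the product of the per-character probabilities."""
    return math.prod(char_probs)


def word_log_likelihood(char_probs):
    """log P(r | ki) as the sum of the per-character log-probabilities."""
    return sum(math.log(p) for p in char_probs)


# Hypothetical per-character probabilities for two candidate words.
word_a = [0.02, 0.5, 0.5, 0.02]
word_b = [0.5, 0.5, 0.02, 0.5]

print(word_likelihood(word_a), word_log_likelihood(word_a))
print(word_likelihood(word_b), word_log_likelihood(word_b))
```

Working with sums also avoids numerical underflow when a word has many characters and each factor is small.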

3.2 Approximation for Practical Use

A significant difference in recognition performance
may occur depending on what is used as
the character characteristic "ri".

3.2.1 When a first candidate is used

Consider that the "character specified as a first
candidate" is used as the character characteristic "ri".
The following are defined.

• Character set C = {ci}
  Example) ci: numeral, or ci: alphabetical upper-case
  or lower-case letter

• Character characteristic set E = {ei}
  ei = (the first candidate is "ci")

• ri ∈ E

For example, assume that the character set C is
"alphabetical upper-case and lower-case letters +
numerals". The number of types of characteristics "ei"
and of characters "ci" is n (C) = n (E) = 62. Thus,
there are 62^2 combinations of (ei, cj). The 62^2 values
of P (ei | cj) are prepared in advance, whereby the
above formula 3 can be calculated. Specifically, for
example, in order to obtain P (ei | "A"), many samples
of "A" may be supplied to the characteristics extraction
processing R, and the frequency of appearance of
each characteristic "ei" may be checked.
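The frequency check described above can be sketched as a relative-frequency estimate. The sample pairs below are hypothetical stand-ins for the output of the characteristics extraction processing R on labeled character samples, and the function name is an assumption made for this illustration.

```python
from collections import Counter, defaultdict


def estimate_table(samples):
    """Estimate P(e | c) by relative frequency.

    samples: iterable of (written_char, first_candidate) pairs.
    Returns table[written_char][first_candidate] = estimated probability.
    """
    counts = defaultdict(Counter)
    for written, candidate in samples:
        counts[written][candidate] += 1
    return {
        written: {cand: n / sum(cands.values()) for cand, n in cands.items()}
        for written, cands in counts.items()
    }


# Hypothetical labeled samples: 10 samples of "A" and 4 samples of "M".
samples = (
    [("A", "A")] * 8 + [("A", "R")] * 2
    + [("M", "M")] * 3 + [("M", "H")] * 1
)
table = estimate_table(samples)
print(table["A"])   # estimated P(e | "A") for each observed first candidate
```

Combinations that never occur in the samples are simply absent from the estimated table; how to treat them (for example, with a small floor value) is a design choice that the text does not specify.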
3.2.2 Approximation

Here, the following approximations may be used.

$$\forall i,\quad P(e_i \mid c_i) = p \tag{4}$$

$$\forall i \neq j,\quad P(e_i \mid c_j) = q \tag{5}$$

The above formulas 4 and 5 are approximations in which,
for any character "ci", the probability that the first
candidate is the character itself is uniformly "p",
and the probability that the first candidate is any one
other character is uniformly "q". At this time, the
following relation holds.

$$p + \{n(E) - 1\}\,q = 1 \tag{6}$$

This approximation amounts to treating the character
string that lists the first candidates as a preliminary
recognition result, and matching this character string
against each word "wa" (that is, "wi") to check how
many characters coincide. When "a" characters are
coincident, the following simple result is obtained.

$$P(r \mid w_i) = p^{a}\,q^{L-a} \tag{7}$$
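As a hedged restatement of why the result takes this form, the product of formula 3 can be split into the character positions at which the first candidate coincides with the character of the word and those at which it does not. The index j over character positions and the grouping of factors are notational assumptions made here for illustration.

```latex
\begin{aligned}
P(r \mid w_i)
  &= \prod_{j=1}^{L} P(r_j \mid w_{ij})
   = \underbrace{p \cdots p}_{a \text{ coincident positions}}\;
     \underbrace{q \cdots q}_{L-a \text{ other positions}}
   = p^{a}\, q^{L-a}, \\
q &= \frac{1-p}{n(E)-1}.
\end{aligned}
```

Since p > q whenever p > 1/n(E), the value increases with the number of coincident characters "a", so that under this approximation ranking words by P (r | wi) is the same as ranking them by the number of coincident characters.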
3.3 Specific Example

For example, consider reading a city name in
address reading of a mail P written in English as shown
in FIG. 2. FIG. 4 shows the result of the delimiting
processing of the character pattern that corresponds to
the portion at which the city name is believed to be
written, identified by the above-mentioned delimiting
processing. This result shows that four characters
are detected. The word dictionary 10 stores candidates
of city names (words) by the number of characters.
For example, candidates of city names (words) that
consist of four characters are shown in FIG. 5.
In this case, five city names each consisting of four
characters are stored: MAIR (k1), SORD (k2), ABLA
(k3), HAMA (k4), and HEWN (k5).

Character recognition is performed for each
character pattern shown in FIG. 4 by the above-
described character recognition processing.
A posteriori probability for each of the city names
shown in FIG. 5 is calculated on the basis of the
character recognition result of each character
pattern.

Although various characteristics (= character
recognition results) can be used for the calculation,
an example using the characters of the first candidate
is shown here. In this case, the character recognition
result for the character patterns shown in FIG. 4 is
"H, A, I, A" in order from the left-most character.
From the above formula 3, the probability P (r | k1)
that the character recognition result "H, A, I, A"
shown in FIG. 4 is produced when the actually written
word is "MAIR (k1)" is as follows.

$$P(r \mid k_1) = P(\text{"H"} \mid \text{"M"})\,P(\text{"A"} \mid \text{"A"})\,P(\text{"I"} \mid \text{"I"})\,P(\text{"A"} \mid \text{"R"}) \tag{8}$$

As described in subsection 3.2.1, the value of each
term on the right side is obtained in advance by
preparing a probability table. Alternatively, the
approximation described in subsection 3.2.2 may be
used; for example, when p = 0.5 and n (E) = 26,
q = 0.02. Thus, the calculation result is obtained as
follows.

$$P(r \mid k_1) = q \cdot p \cdot p \cdot q = 0.0001 \tag{9}$$

That is, the probability P (r | k1) that the
character recognition result "H, A, I, A" is produced
when the actually written word is the city name
MAIR (k1) is 0.0001.
Similarly, the following results are obtained.

$$\begin{aligned}
P(r \mid k_2) &= q \cdot q \cdot q \cdot q = 0.00000016 \\
P(r \mid k_3) &= q \cdot q \cdot q \cdot p = 0.000004 \\
P(r \mid k_4) &= p \cdot p \cdot q \cdot p = 0.0025 \\
P(r \mid k_5) &= p \cdot q \cdot q \cdot q = 0.000004
\end{aligned} \tag{10}$$

The probability P (r | k2) that the character
recognition result "H, A, I, A" shown in FIG. 4 is
produced when the actually written word is "SORD
(k2)" is 0.00000016.

The probability P (r | k3) that the character
recognition result "H, A, I, A" shown in FIG. 4 is
produced when the actually written word is "ABLA
(k3)" is 0.000004.

The probability P (r | k4) that the character
recognition result "H, A, I, A" shown in FIG. 4 is
produced when the actually written word is "HAMA
(k4)" is 0.0025.

The probability P (r | k5) that the character
recognition result "H, A, I, A" shown in FIG. 4 is
produced when the actually written word is "HEWN
(k5)" is 0.000004.
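The values above can be reproduced with a short sketch that counts coincident characters. The dictionary contents, the recognition result "H, A, I, A", p = 0.5, and n (E) = 26 are taken from the text; the function and variable names are assumptions made for this illustration, and equal priors P (ki) are assumed when the best word is chosen.

```python
P = 0.5                      # probability that the first candidate is correct
N_E = 26                     # number of character characteristics n(E)
Q = (1 - P) / (N_E - 1)      # probability of each specific wrong candidate

DICTIONARY = {"k1": "MAIR", "k2": "SORD", "k3": "ABLA", "k4": "HAMA", "k5": "HEWN"}


def likelihood(result, word):
    """P(r | word) under the approximation: p per coincident character, q otherwise."""
    if len(result) != len(word):
        raise ValueError("the number of characters must be equal")
    a = sum(1 for r, w in zip(result, word) if r == w)
    return P ** a * Q ** (len(word) - a)


result = "HAIA"              # first candidates, left to right
scores = {k: likelihood(result, w) for k, w in DICTIONARY.items()}
best = max(scores, key=scores.get)   # assumes equal priors P(ki)

for k, w in DICTIONARY.items():
    print(k, w, scores[k])
print("estimated:", DICTIONARY[best])
```

With these inputs the largest value belongs to k4 (HAMA), which has three coincident characters.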

Assuming that P (k1) to P (k5) are equal to
each other, it follows from the above formula 2 that
the posteriori probabilities P (ki | r) rank in the
same order as the values of P (r | ki). Therefore, the
values of formulas 9 and 10 may be compared with each
other in magnitude. The largest probability is
P (r | k4), and thus the city name written in FIG. 2
is estimated to be HAMA.

A description will now be given of the probability
table 11. FIG. 6 shows the approximation described in
subsection 3.2.2 expressed in the form of a probability
table. The characters are assumed to be the 26 upper-
case alphabetic characters. In FIG. 6, the vertical
axis indicates the actually written characters, while
the horizontal axis represents their character
recognition results. For example, the entry at "M" on
the vertical axis and "H" on the horizontal axis of the
probability table 11 represents the probability
P ("H" | "M") that the character recognition result
becomes "H" when the actually written character is "M".

In the approximation described in subsection 3.2.2,
the probability of each character recognition result
correctly representing the actually written character
is assumed to be "p". Accordingly, the entries on the
diagonal line from the upper left corner of the
probability table 11 to the lower right corner thereof
are constant. In the case of FIG. 6, this probability
is 0.5.

Likewise, in the approximation described in subsection
3.2.2, the probability of each character recognition
result representing a character other than the actually
written character is assumed to be "q". Accordingly,
all the entries of the probability table 11 other than
those on the diagonal line from the upper left corner
to the lower right corner are constant. In the case of
FIG. 6, this probability is 0.02.
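A table of this form can be sketched as a nested mapping indexed first by the actually written character and then by the recognition result. This is a minimal illustration assuming the 26 upper-case letters and p = 0.5 given in the text; the function name is hypothetical and the actual layout of the probability table 11 is not specified beyond the figure.

```python
import string


def uniform_confusion_table(chars, p):
    """table[written][recognized] = p on the diagonal, q elsewhere,
    with q chosen so that each row sums to 1 (formula 6)."""
    q = (1 - p) / (len(chars) - 1)
    return {
        written: {recognized: (p if recognized == written else q)
                  for recognized in chars}
        for written in chars
    }


table = uniform_confusion_table(string.ascii_uppercase, 0.5)
print(table["M"]["H"])   # P("H" | "M"), an off-diagonal entry
print(table["M"]["M"])   # P("M" | "M"), a diagonal entry
```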

As a result of using the approximation described in
subsection 3.2.2, the city name that has the most
characters coinciding with the character recognition
result shown in FIG. 4 is selected from among the city
names contained in the word dictionary 10 shown in
FIG. 5. When the approximation described in
subsection 3.2.2 is not used and, as described in
subsection 3.2.1, each P (ei | cj) is obtained in
advance and the obtained values are used for the
calculation, the city name with the most coincident
characters is not always selected.

For example, a comparatively large value is obtained in
the first term of the above formula 8 because H and M are
similar to each other in shape. Thus, the following
values are assumed.

$$P(\text{"M"}\mid\text{"M"}) = 0.32,\quad P(\text{"H"}\mid\text{"M"}) = 0.2,\quad P(\text{"H"}\mid\text{"H"}) = 0.32,\quad P(\text{"M"}\mid\text{"H"}) = 0.2$$

Similarly, values in the fourth term are obtained in
accordance with the following formulas.

$$P(\text{"R"}\mid\text{"R"}) = 0.42,\quad P(\text{"A"}\mid\text{"R"}) = 0.1,\quad P(\text{"A"}\mid\text{"A"}) = 0.42,\quad P(\text{"R"}\mid\text{"A"}) = 0.1$$

With respect to the other characters, the approximation
described in subsection 3.2.2 can be used. The
probability table 11 in this case is shown in FIG. 7.
At this time, the following result is obtained.

$$\begin{aligned}
P(r \mid k_1) &= P(\text{"H"}\mid\text{"M"}) \cdot P(\text{"A"}\mid\text{"A"}) \cdot p \cdot P(\text{"A"}\mid\text{"R"}) = 0.0042 \\
P(r \mid k_2) &= q \cdot q \cdot q \cdot q = 0.00000016 \\
P(r \mid k_3) &= q \cdot q \cdot q \cdot P(\text{"A"}\mid\text{"A"}) = 0.00000336 \\
P(r \mid k_4) &= P(\text{"H"}\mid\text{"H"}) \cdot P(\text{"A"}\mid\text{"A"}) \cdot q \cdot P(\text{"A"}\mid\text{"A"}) \approx 0.0011 \\
P(r \mid k_5) &= P(\text{"H"}\mid\text{"H"}) \cdot q \cdot q \cdot q = 0.00000256
\end{aligned} \qquad (11)$$

In this formula, P (r | k1) is the largest value,
and the city name estimated to be written on the mail P
shown in FIG. 2 is MAIR.
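
The two leading candidates of formula 11 can be checked numerically with a short Python sketch. The names `CONFUSION`, `char_likelihood` and `word_likelihood` are hypothetical, and the recognition result "HAIA" is an assumption inferred from the factors appearing in formula 11 (FIG. 4 is not reproduced in this section). Only the words MAIR and HAMA, which are named in the text, are evaluated.

```python
P_CORRECT = 0.5   # p of subsection 3.2.2
Q_WRONG = 0.02    # q of subsection 3.2.2

# Entries obtained in advance for similar-looking characters,
# keyed as (recognized, written). Values are those quoted in the text.
CONFUSION = {
    ("M", "M"): 0.32, ("H", "M"): 0.2, ("H", "H"): 0.32, ("M", "H"): 0.2,
    ("R", "R"): 0.42, ("A", "R"): 0.1, ("A", "A"): 0.42, ("R", "A"): 0.1,
}


def char_likelihood(recognized, written):
    """P(recognized | written), falling back to p or q for other characters."""
    if (recognized, written) in CONFUSION:
        return CONFUSION[(recognized, written)]
    return P_CORRECT if recognized == written else Q_WRONG


def word_likelihood(recognized, word):
    """P(r | k) as the product of per-character likelihoods."""
    value = 1.0
    for rec, written in zip(recognized, word):
        value *= char_likelihood(rec, written)
    return value


RESULT = "HAIA"  # assumed first-candidate recognition result
for word in ("MAIR", "HAMA"):
    print(word, word_likelihood(RESULT, word))
```

With these values MAIR scores higher than HAMA even though HAMA has more coincident characters, which is the point the example makes.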

Now, a description will be given of the Bayes
Estimation in word recognition when the number of
characters is not constant, according to the first
embodiment of the present invention. In this case, the
Bayes Estimation is effective in Japanese or any other
language in which no word break occurs. In addition,
in a language in which a word break occurs, the Bayes
Estimation is effective in the case where a word
dictionary contains a character string consisting of
a plurality of words.

4. BAYES ESTIMATION WHEN THE NUMBER OF CHARACTERS
IS NOT CONSTANT

In reality, there is a case in which a character
string of a plurality of words is contained in a
category (for example, NORTH YORK), but a character
string of one word cannot be compared with a character
string of two words in the method described in
chapter 3. In addition, the number of characters is
not constant in a language (such as Japanese) in which
no word break occurs, so that the method described in
chapter 3 cannot be used. This chapter therefore
describes a word recognition method that corresponds to
a case in which the number of characters is not always
constant.
4.1 Definition of Formulas 

An input pattern "x" is defined as a plurality of
words rather than one word, and Bayes Estimation is
performed in a manner similar to that described in
chapter 3. In this case, the definitions in chapter 3
are added and changed as follows.
Changes:

• An input pattern "x" is defined as a plurality
of words.

• L: Total number of characters in the input
pattern "x"

• Category set K = {ki}
ki = (w'j, h)

w'j ∈ W', W': A set of character strings having
the number of characters and the number of words that
can be applied to the input "x"

h: A position of a character string "w'j" in the
input "x". The character string "w'j" starts from the
(h + 1)-th character from the start of the input "x".

In the foregoing description, wb may be expressed
in place of w'j.
Additions:

• Lj: Total number of characters in the character
string "w'j"

• w'jk: k-th character of w'j, w'jk ∈ C

At this time, when Bayes Estimation is used,
the posteriori probability P (ki | r) has the same form
as that obtained by the above formula 2.

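$$P(k_i \mid r) = \frac{P(r \mid k_i)\,P(k_i)}{P(r)} \qquad (12)$$

P (r | ki) is represented as follows.

$$\begin{aligned}
P(r \mid k_i) &= P(r_1, r_2, \ldots, r_h \mid k_i) \\
&\quad \cdot P(r_{h+1} \mid w'_{j1})\,P(r_{h+2} \mid w'_{j2}) \cdots P(r_{h+L_j} \mid w'_{jL_j}) \\
&\quad \cdot P(r_{h+L_j+1}, r_{h+L_j+2}, \ldots, r_L \mid k_i) \\
&= P(r_1, r_2, \ldots, r_h \mid k_i) \left\{ \prod_{k=1}^{L_j} P(r_{h+k} \mid w'_{jk}) \right\} P(r_{h+L_j+1}, r_{h+L_j+2}, \ldots, r_L \mid k_i)
\end{aligned} \qquad (13)$$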

Assume that P (ki) is obtained in the same way as that
described in chapter 3. Note that n (K) is considerably
larger than that in chapter 3, and thus, the value
of P (ki) is correspondingly smaller than that in chapter 3.

4.2 Approximation for Practical Use

4.2.1 Approximation relevant to a portion free of 
any character string and normalization of the number of 
characters 

The first term of the above formula 13 is
approximated as follows.

$$P(r_1, r_2, \ldots, r_h \mid k_i) \approx P(r_1, r_2, \ldots, r_h) \approx P(r_1)\,P(r_2) \cdots P(r_h) \qquad (14)$$

The approximation in the first line ignores the effect
of "wb" on the portion of the input pattern "x" to which
the character string "wb" is not applied. The
approximation in the second line assumes that each "rk"
is independent. This is not strictly true. These
approximations are coarse, but are very effective.

Similarly, when the third term of the above
formula 13 is approximated, the formula 13 is changed
as follows.

$$P(r \mid k_i) = \prod_{k=1}^{L_j} P(r_{h+k} \mid w'_{jk}) \prod_{\substack{1 \le k \le h \\ h+L_j+1 \le k \le L}} P(r_k) \qquad (15)$$

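Here, consider the value of P (ki | r) / P (ki). This
value indicates how the probability of "ki" increases or
decreases by knowing a characteristic "r".

$$\begin{aligned}
\frac{P(k_i \mid r)}{P(k_i)} &= \frac{P(r \mid k_i)}{P(r)} \\
&\approx \frac{\displaystyle \prod_{k=1}^{L_j} P(r_{h+k} \mid w'_{jk}) \prod_{\substack{1 \le k \le h \\ h+L_j+1 \le k \le L}} P(r_k)}{\displaystyle \prod_{k=1}^{L} P(r_k)} \\
&= \prod_{k=1}^{L_j} \frac{P(r_{h+k} \mid w'_{jk})}{P(r_{h+k})}
\end{aligned} \qquad (16)$$

The approximation used for the denominator in line 2 of
the formula 16 is similar to that of the above
formula 14.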

This result is very important. The right side
of the above formula 16 contains no term concerning
the portion of the input to which the character string
"wb" is not applied. That is, the above formula 16 does
not depend on what the rest of the input pattern "x" is.
From this fact, it is found that P (ki | r) can be
calculated by evaluating the above formula 16, without
worrying about the position and length of the character
string "wb", and multiplying the result by P (ki).

The numerator of the above formula 16 is the same as
that of the above formula 3, namely, P (r | ki) when
the number of characters is constant. This means that
the above formula 16 performs normalization of the
number of characters by using the denominator.
4.2.2 When a first candidate is used 
Here, assume that the character specified as a first
candidate is used as a characteristic, as described in
subsection 3.2.1. The following approximation of
P (rk) is assumed.

$$P(r_k) = \frac{1}{n(E)} \qquad (17)$$

In reality, there is a need to consider the
probability of generation of each character, but this
consideration is ignored here. At this time, when
the above formula 16 is approximated by using the
approximation described in subsection 3.2.2, the
following result is obtained.

$$\frac{P(k_i \mid r)}{P(k_i)} \approx p^{a}\, q^{L_j - a}\, n(E)^{L_j} \qquad (18)$$

where a denotes the number of characters of the character
string that coincide with the corresponding character
recognition results, and normalization is effected by
n (E)^Lj.
4.3 Specific Example 
For example, consider that a city name is read in
mail address reading when:

• there exists a city name consisting of a
plurality of words in a language (such as English) in
which a word break occurs; or

• a city name is written in a language (such
as Japanese) in which no word break occurs.

In these cases, the number of characters of each
candidate is not constant. For example, consider that
a city name is read in address reading of mail P
written in English as shown in FIG. 8. FIG. 9 shows
the delimiting processing result of a character pattern
that corresponds to a portion at which it is believed
that the city name identified by the above described
delimiting processing is written, wherein it is
detected that a word consisting of two characters is
followed by a space, and such space is followed by
a word consisting of three characters. The word
dictionary 10, as shown in FIG. 10, stores all the city
names having the number of characters or the number of
words that can be applied to FIG. 9. In this case, five
city names are stored: COH (k1), LE ITH (k2), OTH (k3),
SK (k4), and ST LIN (k5).

Character recognition is performed for each
character pattern shown in FIG. 9 by the above
described character recognition processing.
The posteriori probability is calculated for each city
name shown in FIG. 10 on the basis of the character
recognition result obtained for each such character
pattern.

Although characteristics used for calculation
(= character recognition results) are various, an
example using characters specified as a first candidate
is shown here. In this case, the character recognition
result is S, K, C, T, H in order from the left-most
character for the character patterns shown in
FIG. 9. When the approximation described in subsection
4.2.1 is used, in accordance with the above formula 16,
the change in the probability P (k1 | r) that the last
three characters are "COH" when the character
recognition result is "S, K, C, T, H" is obtained as
follows.

$$\frac{P(k_1 \mid r)}{P(k_1)} \approx \frac{P(\text{"C"}\mid\text{"C"})}{P(\text{"C"})} \cdot \frac{P(\text{"T"}\mid\text{"O"})}{P(\text{"T"})} \cdot \frac{P(\text{"H"}\mid\text{"H"})}{P(\text{"H"})} \qquad (19)$$

Further, in the case where the approximation described in
subsections 3.2.2 and 4.2.2 is used, when p = 0.5 and
n (E) = 26, q = 0.02. Thus, the following result is
obtained.

$$\frac{P(k_1 \mid r)}{P(k_1)} \approx p \cdot q \cdot p \cdot n(E)^3 \approx 87.88 \qquad (20)$$

Similarly, the following result is obtained.

$$\begin{aligned}
\frac{P(k_2 \mid r)}{P(k_2)} &\approx q \cdot q \cdot q \cdot p \cdot p \cdot n(E)^5 \approx 23.76 \\
\frac{P(k_3 \mid r)}{P(k_3)} &\approx q \cdot p \cdot p \cdot n(E)^3 \approx 87.88 \\
\frac{P(k_4 \mid r)}{P(k_4)} &\approx p \cdot p \cdot n(E)^2 = 169 \\
\frac{P(k_5 \mid r)}{P(k_5)} &\approx p \cdot q \cdot q \cdot q \cdot q \cdot n(E)^5 \approx 0.95
\end{aligned} \qquad (21)$$

In the above formula, "k3" assumes that the right three
characters are OTH, and "k4" assumes that the left two
characters are SK.

Assuming that P (k1) to P (k5) are equal to each
other, the magnitudes of the posteriori probabilities
P (ki | r) can be compared by comparing the values of
the above formulas 20 and 21 with each other.
The highest probability is P (k4 | r), and thus, the
city name written in FIG. 8 is estimated as SK.
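
The values in this example can be reproduced with a minimal Python sketch. It assumes the first-candidate approximations of subsections 3.2.2 and 4.2.2 (likelihood p or q per character, P(rk) = 1/n(E)); the names `ratio` and `CATEGORIES` are hypothetical, and the city names are stored without the inter-word space because only character positions are compared here.

```python
N_E = 26                   # n(E): number of character categories
P = 0.5                    # probability of a correct first candidate
Q = (1 - P) / (N_E - 1)    # probability of each wrong candidate (0.02)


def ratio(recognized, word, h):
    """P(ki | r) / P(ki) for a word starting at the (h + 1)-th character."""
    value = 1.0
    for k, written in enumerate(word):
        likelihood = P if recognized[h + k] == written else Q
        value *= likelihood * N_E   # divide by P(rk) = 1 / n(E)
    return value


RECOGNIZED = "SKCTH"
# (character string, position h) for each category of the example
CATEGORIES = {
    "k1": ("COH", 2),
    "k2": ("LEITH", 0),
    "k3": ("OTH", 2),
    "k4": ("SK", 0),
    "k5": ("STLIN", 0),
}

scores = {name: ratio(RECOGNIZED, w, h) for name, (w, h) in CATEGORIES.items()}
best = max(scores, key=scores.get)
for name, score in scores.items():
    print(name, round(score, 2))
print("estimated category:", best)
```

Because each factor is multiplied by n(E), a two-character word and a five-character word end up on a comparable scale, which is the normalization described in subsection 4.2.1.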

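Next, an example is shown in which the approximation
described in subsection 3.2.2 is not used and, as
described in subsection 3.2.1, each P (ei | cj) is
obtained in advance, and then, the obtained value is
used for calculation.

Because the shapes of C and L, T and I, and H and
N are similar to each other, it is assumed that the
following values are obtained.

$$\begin{aligned}
P(\text{"C"}\mid\text{"C"}) &= P(\text{"L"}\mid\text{"L"}) = P(\text{"T"}\mid\text{"T"}) = P(\text{"I"}\mid\text{"I"}) = P(\text{"H"}\mid\text{"H"}) = P(\text{"N"}\mid\text{"N"}) = 0.4 \\
P(\text{"C"}\mid\text{"L"}) &= P(\text{"L"}\mid\text{"C"}) = P(\text{"T"}\mid\text{"I"}) = P(\text{"I"}\mid\text{"T"}) = P(\text{"N"}\mid\text{"H"}) = P(\text{"H"}\mid\text{"N"}) = 0.12
\end{aligned}$$

The approximation described in subsection 3.2.2 is applied
to the other characters. The probability table
11 in this case is shown in FIG. 11. At this time, the
following result is obtained.

$$\begin{aligned}
\frac{P(k_1 \mid r)}{P(k_1)} &\approx P(\text{"C"}\mid\text{"C"}) \cdot q \cdot P(\text{"H"}\mid\text{"H"}) \cdot n(E)^3 \approx 56.24 \\
\frac{P(k_2 \mid r)}{P(k_2)} &\approx q \cdot q \cdot q \cdot P(\text{"T"}\mid\text{"T"}) \cdot P(\text{"H"}\mid\text{"H"}) \cdot n(E)^5 \approx 15.21 \\
\frac{P(k_3 \mid r)}{P(k_3)} &\approx q \cdot P(\text{"T"}\mid\text{"T"}) \cdot P(\text{"H"}\mid\text{"H"}) \cdot n(E)^3 \approx 56.24 \\
\frac{P(k_4 \mid r)}{P(k_4)} &\approx p \cdot p \cdot n(E)^2 = 169 \\
\frac{P(k_5 \mid r)}{P(k_5)} &\approx p \cdot q \cdot P(\text{"C"}\mid\text{"L"}) \cdot P(\text{"T"}\mid\text{"I"}) \cdot P(\text{"H"}\mid\text{"N"}) \cdot n(E)^5 \approx 205.3
\end{aligned} \qquad (22)$$

In this formula, P (k5 | r) / P (k5) is the
largest value, and the city name estimated to be
written in FIG. 8 is ST LIN.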
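
As a rough check of this second calculation, the sketch below replaces the uniform p/q likelihood with the values quoted above for the similar pairs C/L, T/I and H/N. The helper names (`SIMILAR_PAIRS`, `char_likelihood`, `ratio`) are hypothetical, and P(rk) = 1/n(E) is assumed as in subsection 4.2.2.

```python
N_E = 26
P = 0.5
Q = (1 - P) / (N_E - 1)

SIMILAR_PAIRS = [frozenset("CL"), frozenset("TI"), frozenset("HN")]
SIMILAR_CHARS = set("CLTIHN")


def char_likelihood(recognized, written):
    """P(recognized | written) using the values assumed for FIG. 11."""
    if recognized == written:
        return 0.4 if written in SIMILAR_CHARS else P
    if frozenset((recognized, written)) in SIMILAR_PAIRS:
        return 0.12
    return Q


def ratio(recognized, word, h):
    """P(ki | r) / P(ki) with P(rk) approximated as 1 / n(E)."""
    value = 1.0
    for k, written in enumerate(word):
        value *= char_likelihood(recognized[h + k], written) * N_E
    return value


RECOGNIZED = "SKCTH"
CATEGORIES = {
    "k1": ("COH", 2), "k2": ("LEITH", 0), "k3": ("OTH", 2),
    "k4": ("SK", 0), "k5": ("STLIN", 0),
}
scores = {name: ratio(RECOGNIZED, w, h) for name, (w, h) in CATEGORIES.items()}
best = max(scores, key=scores.get)
print({name: round(score, 2) for name, score in scores.items()}, best)
```

Here ST LIN wins although only its first character coincides with the recognition result, because three of its characters are plausible confusions of what was read.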

In this way, in the first embodiment, recognition
processing is performed for each character of an input
character string that corresponds to a word to be
recognized; a probability of the generation of the
characteristics obtained as the result of character
recognition is obtained, conditioned on each character
of the words contained in a word dictionary that stores
in advance candidates of words to be recognized; the thus
obtained probability is divided by a probability of the
generation of the characteristics obtained as the result
of character recognition; the division results obtained
for the characters of each word contained in the word
dictionary are multiplied together for all the
characters; and the word recognition result
is obtained based on the multiplication results.

That is, in word recognition using the character
recognition result, even in the case where the number
of characters in a word is not constant, word
recognition can be performed precisely by using an
evaluation function based on a posteriori probability
that remains valid when the number of characters in a
word is not constant.

Now, a description will be given of Bayes
Estimation according to a second embodiment of the
present invention, the Bayes Estimation being
characterized in that, when word delimiting is
ambiguous, such ambiguity is included in calculation of
the posteriori probability. In this case, the Bayes
Estimation is effective when erroneous detection of
word breaks cannot be ignored.

5. INTEGRATION OF WORD DELIMITING 
In a language (such as English) in which a word
break occurs, the methods described in the foregoing
chapters 1 to 4 assume that a word is always delimited
correctly. If this assumption is not met and the number
of characters changes, these methods cannot
be used. In this chapter, the result of word
delimiting is treated as a probability rather than
as an absolute result, whereby the ambiguity of word
delimiting is integrated with the Bayes Estimation in
word recognition. A primary difference from chapter 4
is that the characteristics between characters obtained
as the result of word delimiting are taken into
consideration.

5.1 Definition of Formulas 

This section assumes that character delimiting
is completely successful, and that no noise entry occurs.
The definitions in chapter 4 are added and changed as
follows.
Changes

• An input pattern "x" is defined as a line.

• L: Total number of characters in the input
line "x"

• Category set K = {ki}

ki = (wj, h), wj ∈ W, W: A set of all
candidates of character strings (The number of
characters is not limited.)

h: A position of a character string "wj" in an
input line "x". The character string "wj" starts from
the (h + 1)-th character from the start of the input
line "x".

In the foregoing description, "wc" may be
expressed in place of "wj".
Additions

• wj = (wj1, wj2, ..., wjLj, w'j0, w'j1, w'j2, ...,
w'jLj-1, w'jLj)

Lj: Number of characters in character string "wj"
wjk: k-th character of character string "wj",
wjk ∈ C
w'jk: Whether or not a word break occurs between the
k-th character and the (k + 1)-th character of character
string "wj"

w'jk ∈ S, S = {s0, s1 (, s2)}
s0: Break
s1: No break
(s2: Start or end of line)

(s2 is provided for representing the start or end
of line in the same format, and is not essential.)

Change

• Characteristic "r" = (rc, rs)

rc: Character characteristics, and rs:
Characteristics of character spacing
Addition

• Character characteristics rc = (rc1, rc2, rc3, ...,
rcL)

rci: Character characteristics of the i-th character
(= character recognition result)
(Example: First candidate; first to third
candidates; candidates having predetermined similarity;
first and second candidates and their similarity;
and the like)

• Character spacing characteristics rs = (rs0, rs1,
rs2, ..., rsL)

rsi: Characteristics of character spacing between
the i-th character and the (i + 1)-th character

At this time, the posteriori probability
P (ki | r) can be represented by the following formula.

$$\begin{aligned}
P(k_i \mid r) &= P(k_i \mid r_c, r_s) \\
&= \frac{P(r_c, r_s \mid k_i)\,P(k_i)}{P(r_c, r_s)} \\
&= \frac{P(r_c \mid r_s, k_i)\,P(r_s \mid k_i)\,P(k_i)}{P(r_c, r_s)}
\end{aligned} \qquad (23)$$

In this formula, assuming that P (rs | ki) and
P (rc | ki) are independent of each other (this
means that character characteristics extraction and
extraction of the characteristics of character spacing
are independent of each other), P (rc | rs, ki) =
P (rc | ki). Thus, the above formula 23 is changed as
follows.

$$P(k_i \mid r) = \frac{P(r_c \mid k_i)\,P(r_s \mid k_i)\,P(k_i)}{P(r_c, r_s)} \qquad (24)$$

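P (rc | ki) is substantially similar to that obtained
by the above formula 13.

$$\begin{aligned}
P(r_c \mid k_i) &= P(r_{c1}, r_{c2}, \ldots, r_{ch} \mid k_i) \\
&\quad \cdot P(r_{c\,h+1} \mid w_{j1})\,P(r_{c\,h+2} \mid w_{j2}) \cdots P(r_{c\,h+L_j} \mid w_{jL_j}) \\
&\quad \cdot P(r_{c\,h+L_j+1}, \ldots, r_{cL} \mid k_i) \\
&= P(r_{c1}, r_{c2}, \ldots, r_{ch} \mid k_i) \left\{ \prod_{k=1}^{L_j} P(r_{c\,h+k} \mid w_{jk}) \right\} P(r_{c\,h+L_j+1}, \ldots, r_{cL} \mid k_i)
\end{aligned} \qquad (25)$$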

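P (rs | ki) is represented as follows.

$$\begin{aligned}
P(r_s \mid k_i) &= P(r_{s1}, r_{s2}, \ldots, r_{s\,h-1} \mid k_i) \\
&\quad \cdot P(r_{s\,h} \mid w'_{j0})\,P(r_{s\,h+1} \mid w'_{j1}) \cdots P(r_{s\,h+L_j} \mid w'_{jL_j}) \\
&\quad \cdot P(r_{s\,h+L_j+1}, \ldots, r_{s\,L-1} \mid k_i) \\
&= P(r_{s1}, r_{s2}, \ldots, r_{s\,h-1} \mid k_i) \left\{ \prod_{k=0}^{L_j} P(r_{s\,h+k} \mid w'_{jk}) \right\} P(r_{s\,h+L_j+1}, \ldots, r_{s\,L-1} \mid k_i)
\end{aligned} \qquad (26)$$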

Assume that P (ki) is obtained in a manner similar to
that described in chapters 1 to 4. Note, however, that
n (K) is in general considerably larger
than that described in chapter 4.

5.2 Approximation for Practical Use 

5.2.1 Approximation relevant to a portion free of 
a character string and normalization of the number of 
characters 

When approximation similar to that described in 
section 4.2.1 is used, the following result is 
obtained. 

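$$P(r_c \mid k_i) = \prod_{k=1}^{L_j} P(r_{c\,h+k} \mid w_{jk}) \prod_{\substack{1 \le k \le h \\ h+L_j+1 \le k \le L}} P(r_{ck}) \qquad (27)$$

Similarly, the above formula 26 is approximated as
follows.

$$P(r_s \mid k_i) = \prod_{k=0}^{L_j} P(r_{s\,h+k} \mid w'_{jk}) \prod_{\substack{1 \le k \le h-1 \\ h+L_j+1 \le k \le L-1}} P(r_{sk}) \qquad (28)$$

When the value of P (ki | r) / P (ki) is considered in
a manner similar to that described in subsection 4.2.1,
the formula is changed as follows.

$$\begin{aligned}
\frac{P(k_i \mid r)}{P(k_i)} &= \frac{P(r_c \mid k_i)\,P(r_s \mid k_i)}{P(r_c, r_s)} \\
&\approx \frac{P(r_c \mid k_i)}{P(r_c)} \cdot \frac{P(r_s \mid k_i)}{P(r_s)} \\
&= \frac{P(k_i \mid r_c)}{P(k_i)} \cdot \frac{P(k_i \mid r_s)}{P(k_i)}
\end{aligned} \qquad (29)$$

The first line of the above formula 29 is in accordance
with the above formula 24. The second line uses the
approximation given by the following formula.

$$P(r_c, r_s) \approx P(r_c)\,P(r_s)$$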

The above formula 29 shows that the change in the
probability of "ki" caused by knowing the characteristics
can be handled independently for "rc" and for "rs".
Each of these changes is calculated below.

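$$\begin{aligned}
\frac{P(k_i \mid r_c)}{P(k_i)} &= \frac{P(r_c \mid k_i)}{P(r_c)} \\
&\approx \frac{\displaystyle \prod_{k=1}^{L_j} P(r_{c\,h+k} \mid w_{jk}) \prod_{\substack{1 \le k \le h \\ h+L_j+1 \le k \le L}} P(r_{ck})}{\displaystyle \prod_{k=1}^{L} P(r_{ck})} \\
&= \prod_{k=1}^{L_j} \frac{P(r_{c\,h+k} \mid w_{jk})}{P(r_{c\,h+k})}
\end{aligned} \qquad (30)$$

$$\begin{aligned}
\frac{P(k_i \mid r_s)}{P(k_i)} &= \frac{P(r_s \mid k_i)}{P(r_s)} \\
&\approx \frac{\displaystyle \prod_{k=0}^{L_j} P(r_{s\,h+k} \mid w'_{jk}) \prod_{\substack{1 \le k \le h-1 \\ h+L_j+1 \le k \le L-1}} P(r_{sk})}{\displaystyle \prod_{k=1}^{L-1} P(r_{sk})} \\
&= \prod_{k=0}^{L_j} \frac{P(r_{s\,h+k} \mid w'_{jk})}{P(r_{s\,h+k})}
\end{aligned} \qquad (31)$$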

The approximation used for the denominator in the second
line of each of the above formulas 30 and 31 is similar to
that of the above formula 14. In the third
line of the formula 31, rs0 and rsL are always at the
start and end of the line (d3 shown in the example of
the next subsection 5.2.2), and thus P (rs0) = P (rsL) = 1.

From the foregoing, the following result is
obtained.

$$\frac{P(k_i \mid r)}{P(k_i)} = \prod_{k=1}^{L_j} \frac{P(r_{c\,h+k} \mid w_{jk})}{P(r_{c\,h+k})} \prod_{k=0}^{L_j} \frac{P(r_{s\,h+k} \mid w'_{jk})}{P(r_{s\,h+k})} \qquad (32)$$

As in the above formula 16, the above formula
32 contains no term concerning the
portion to which the character string "wc" is not
applied. That is, in this case as well, "normalization
caused by a denominator" can be considered.

5.2.2 Example of characteristics of character 
spacing "rs" 

An example of the characteristics is defined as
follows.

• Characteristics of character spacing set D =
{d0, d1, d2 (, d3)}

d0: Expanded character spacing
d1: Condensed character spacing
d2: No character spacing

(d3: This denotes the start or end of the line,
and always denotes a word break.)

• rs ∈ D

At this time, the probabilities given by the following
formula are established in advance.

$$P(d_k \mid s_l), \quad k = 0, 1, 2, \quad l = 0, 1$$

With these values, the numerator in the second term of
the above formula 32, namely

$$P(r_{s\,h+k} \mid w'_{jk}),$$

can be obtained, where P (d3 | s2) = 1.

In addition, the probabilities set forth below are
established in advance, whereby the denominator P (rsk)
in the second term of the above formula 32 can be
obtained.

$$P(d_k), \quad k = 0, 1, 2$$
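
The text does not say how these probabilities are established. One plausible way, shown here purely as an assumption, is to count labelled samples of (spacing characteristic, break state) pairs. The function name `spacing_tables` and the sample counts below are hypothetical and are not taken from the tables of this document.

```python
from collections import Counter


def spacing_tables(samples):
    """Estimate P(dk | sl) and P(dk) from (spacing, break_state) samples."""
    joint = Counter(samples)
    total = sum(joint.values())
    state_totals = Counter()
    feature_totals = Counter()
    for (d, s), count in joint.items():
        state_totals[s] += count
        feature_totals[d] += count
    p_d_given_s = {(d, s): c / state_totals[s] for (d, s), c in joint.items()}
    p_d = {d: c / total for d, c in feature_totals.items()}
    return p_d_given_s, p_d


# Hypothetical labelled data: s0 = word break, s1 = no word break.
samples = (
    [("d0", "s0")] * 8 + [("d1", "s0")] * 2
    + [("d0", "s1")] * 2 + [("d1", "s1")] * 10 + [("d2", "s1")] * 18
)
p_d_given_s, p_d = spacing_tables(samples)

# Ratio used in the second term of formula 32 for an expanded gap at a break.
print(p_d_given_s[("d0", "s0")] / p_d["d0"])
```

A ratio above 1 means that observing the spacing characteristic makes the corresponding break state more likely than before; a ratio below 1 makes it less likely.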

5.3 Specific Example 

As in subsection 4.3, consider that a city name is
read in address reading of mail in English.

For example, consider that a city name is read in
address reading of mail P written in English, as shown
in FIG. 12. FIG. 13 shows the delimiting processing result
of a character pattern that corresponds to a portion at
which it is believed that the city name identified by
the above described delimiting processing is written,
wherein a total of five characters are detected. It is
detected that the first and second characters are not
spaced from each other; the second and third
characters are expanded in spacing; and the third and
fourth characters and the fourth and fifth characters
are condensed in spacing. FIG. 14A, FIG. 14B, and
FIG. 14C show the contents of the word dictionary 10,
wherein all city names are stored. In this case, three
city names are stored: ST LIN shown in FIG. 14A, SLIM
shown in FIG. 14B, and SIM shown in FIG. 14C. The sign
(s0, s1) described under each city name denotes whether
or not a word break occurs. s0 denotes a word break,
and s1 denotes no word break.

FIG. 15 illustrates a set of categories. Each
category includes position information, and thus, is
different from the word dictionary 10. A category k1
is made of the word shown in FIG. 14A; categories k2 and
k3 are made of the word shown in FIG. 14B; and categories
k4, k5, and k6 are made of the word shown in FIG. 14C.
Specifically, the category k1 is made of "STLIN"
occupying all five characters; the category k2 is made
of "SLIM" occupying the first to fourth characters; the
category k3 is made of "SLIM" occupying the second to
fifth characters; the category k4 is made of "SIM"
occupying the first to third characters; the category
k5 is made of "SIM" occupying the second to fourth
characters; and the category k6 is made of "SIM"
occupying the third to fifth characters.



Character recognition is performed for each
character pattern shown in FIG. 13 by the above
described character recognition processing. The
character recognition result is used for calculating
the posteriori probability of each of the categories
shown in FIG. 15. Although characteristics used for
calculation (= character recognition results) are
various, an example using characters specified as a
first candidate is shown here.

In this case, the five characters "S, S, L, I, M"
from the start (leftmost character) are obtained as
character recognition results for the character
patterns shown in FIG. 13.

Although a variety of characteristics of character
spacing are conceivable, the example described in
subsection 5.2.2 is used here. FIG. 13 shows the
characteristics of character spacing. The first and
second characters are not spaced from each
other, and thus, the characteristics of character
spacing are defined as "d2". The second and third
characters are expanded in spacing, and thus, the
characteristics of character spacing are defined
as "d0". The third and fourth characters and the
fourth and fifth characters are condensed in spacing,
and thus, the characteristics of character spacing are
defined as "d1".

When the approximation described in subsection 5.2.1
is used, in accordance with the above formula 30,
a change P (k1 | rc) / P (k1) in the probability of
the category k1, the change caused by knowing
the character recognition result "S, S, L, I, M", is
obtained by the following formula.

$$\frac{P(k_1 \mid r_c)}{P(k_1)} \approx \frac{P(\text{"S"}\mid\text{"S"})}{P(\text{"S"})} \cdot \frac{P(\text{"S"}\mid\text{"T"})}{P(\text{"S"})} \cdot \frac{P(\text{"L"}\mid\text{"L"})}{P(\text{"L"})} \cdot \frac{P(\text{"I"}\mid\text{"I"})}{P(\text{"I"})} \cdot \frac{P(\text{"M"}\mid\text{"N"})}{P(\text{"M"})} \qquad (33)$$

In accordance with the above formula 31, a change
P (k1 | rs) / P (k1) in the probability of the category
k1, the change caused by knowing the characteristics of
character spacing shown in FIG. 14, is obtained by the
following formula.

$$\frac{P(k_1 \mid r_s)}{P(k_1)} \approx \frac{P(d_2 \mid s_1)}{P(d_2)} \cdot \frac{P(d_0 \mid s_0)}{P(d_0)} \cdot \frac{P(d_1 \mid s_1)}{P(d_1)} \cdot \frac{P(d_1 \mid s_1)}{P(d_1)} \qquad (34)$$

If the approximation described in subsections 3.2.2
and 4.2.2 is used to make the calculation in accordance
with the above formula 33, for example, when p = 0.5
and n (E) = 26, q = 0.02. The above formula 33 is
then computed as follows.

$$\frac{P(k_1 \mid r_c)}{P(k_1)} \approx p \cdot q \cdot p \cdot p \cdot q \cdot n(E)^5 \approx 594 \qquad (35)$$

In order to make the calculation in accordance with the
above formula 34, it is required to obtain the
following values in advance.

$$P(d_k \mid s_l),\ k = 0, 1, 2,\ l = 0, 1 \quad \text{and} \quad P(d_k),\ k = 0, 1, 2$$

As an example, it is assumed that the values
in tables 1 and 2 are obtained.
Table 1 lists the values of the following joint
probability.

$$P(d_k \cap s_l)$$

Table 2 lists the values of P (dk | sl). In this case,
note that the relationship expressed by the following
formula is met.

$$P(d_k \cap s_l) = P(d_k \mid s_l)\,P(s_l)$$

In reality, P (dk | sl) / P (dk) is required for
calculation using the above formula 34, and thus,
these values are shown in table 3.
The above formula 34 is calculated as follows
based on the values shown in table 3.

$$\frac{P(k_1 \mid r_s)}{P(k_1)} \approx 1.22 \cdot 4 \cdot 1.16 \cdot 1.16 \approx 6.57 \qquad (36)$$

From the above formula 29, a change P (k1 | r) / P (k1)
in the probability of the category k1, the
change caused by knowing the character recognition
result "S, S, L, I, M" and the characteristics of
character spacing, is represented by the product of
the above formulas 35 and 36, and is obtained by the
following formula.

$$\frac{P(k_1 \mid r)}{P(k_1)} \approx 594 \cdot 6.57 \approx 3900 \qquad (37)$$

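Similarly, P (ki | rc) / P (ki), P (ki | rs) / P (ki), and
P (ki | r) / P (ki) are obtained with respect to k2 to
k6 as follows.

$$\begin{aligned}
\frac{P(k_2 \mid r_c)}{P(k_2)} &\approx p \cdot q \cdot q \cdot q \cdot n(E)^4 \approx 1.83 \\
\frac{P(k_3 \mid r_c)}{P(k_3)} &\approx p \cdot p \cdot p \cdot p \cdot n(E)^4 \approx 28600 \\
\frac{P(k_4 \mid r_c)}{P(k_4)} &\approx p \cdot q \cdot q \cdot n(E)^3 \approx 3.52 \\
\frac{P(k_5 \mid r_c)}{P(k_5)} &\approx p \cdot q \cdot q \cdot n(E)^3 \approx 3.52 \\
\frac{P(k_6 \mid r_c)}{P(k_6)} &\approx q \cdot p \cdot p \cdot n(E)^3 \approx 87.9
\end{aligned} \qquad (38)$$

$$\begin{aligned}
\frac{P(k_2 \mid r_s)}{P(k_2)} &\approx 1.22 \cdot 0.25 \cdot 1.16 \cdot 0.35 \approx 0.124 \\
\frac{P(k_3 \mid r_s)}{P(k_3)} &\approx 0.14 \cdot 0.25 \cdot 1.16 \cdot 1.16 \approx 0.0471 \\
\frac{P(k_4 \mid r_s)}{P(k_4)} &\approx 1.22 \cdot 0.25 \cdot 0.35 \approx 0.107 \\
\frac{P(k_5 \mid r_s)}{P(k_5)} &\approx 0.14 \cdot 0.25 \cdot 1.16 \cdot 0.35 \approx 0.0142 \\
\frac{P(k_6 \mid r_s)}{P(k_6)} &\approx 4 \cdot 1.16 \cdot 1.16 \approx 5.38
\end{aligned} \qquad (39)$$

$$\begin{aligned}
\frac{P(k_2 \mid r)}{P(k_2)} &\approx 1.83 \cdot 0.124 \approx 0.227 \\
\frac{P(k_3 \mid r)}{P(k_3)} &\approx 28600 \cdot 0.0471 \approx 1350 \\
\frac{P(k_4 \mid r)}{P(k_4)} &\approx 3.52 \cdot 0.107 \approx 0.377 \\
\frac{P(k_5 \mid r)}{P(k_5)} &\approx 3.52 \cdot 0.0142 \approx 0.0500 \\
\frac{P(k_6 \mid r)}{P(k_6)} &\approx 87.9 \cdot 5.38 \approx 473
\end{aligned} \qquad (40)$$

The category having the maximum value in the above
formulas 37 and 40 is "k1". Therefore, the estimation
result is ST LIN.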

In the method described in chapter 4, which does
not use characteristics of character spacing, the
category "k3", which is maximum in the formulas 35
and 38, is the estimation result. In contrast, it is
found that the category "k1", which is believed to match
best comprehensively, is selected by integrating the
characteristics of character spacing.
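
The whole example can be traced with a short Python sketch. Several things here are assumptions made for illustration: the names `char_ratio`, `spacing_ratio` and `CATEGORIES` are hypothetical; the six values in `SPACING_RATIO` are the P(dk|sl)/P(dk) figures that appear in formulas 36 and 39, and their assignment to (dk, sl) pairs is inferred from those formulas because table 3 is not reproduced in this section; a gap observed as d3 (start or end of the line) is given ratio 1.

```python
N_E = 26
P = 0.5
Q = (1 - P) / (N_E - 1)

# P(dk | sl) / P(dk), inferred from formulas 36 and 39 (assumption).
SPACING_RATIO = {
    ("d2", "s1"): 1.22, ("d0", "s0"): 4.0, ("d1", "s1"): 1.16,
    ("d0", "s1"): 0.25, ("d1", "s0"): 0.35, ("d2", "s0"): 0.14,
}

RECOGNIZED = "SSLIM"                           # rc1 .. rc5
SPACING = ["d3", "d2", "d0", "d1", "d1", "d3"]  # rs0 .. rs5


def char_ratio(word, h):
    """First product of formula 32 with the p/q and 1/n(E) approximations."""
    value = 1.0
    for k, written in enumerate(word):
        value *= (P if RECOGNIZED[h + k] == written else Q) * N_E
    return value


def spacing_ratio(breaks, h):
    """Second product of formula 32; breaks holds w'j0 .. w'jLj."""
    value = 1.0
    for k, state in enumerate(breaks):
        feature = SPACING[h + k]
        if feature != "d3":          # line start/end contributes ratio 1
            value *= SPACING_RATIO[(feature, state)]
    return value


# (characters, break states w'j0..w'jLj, position h)
CATEGORIES = {
    "k1": ("STLIN", ["s0", "s1", "s0", "s1", "s1", "s0"], 0),
    "k2": ("SLIM", ["s0", "s1", "s1", "s1", "s0"], 0),
    "k3": ("SLIM", ["s0", "s1", "s1", "s1", "s0"], 1),
    "k4": ("SIM", ["s0", "s1", "s1", "s0"], 0),
    "k5": ("SIM", ["s0", "s1", "s1", "s0"], 1),
    "k6": ("SIM", ["s0", "s1", "s1", "s0"], 2),
}

char_only = {n: char_ratio(w, h) for n, (w, _, h) in CATEGORIES.items()}
total = {n: char_ratio(w, h) * spacing_ratio(b, h)
         for n, (w, b, h) in CATEGORIES.items()}
print("characters only:", max(char_only, key=char_only.get))
print("with spacing:   ", max(total, key=total.get))
```

The small differences from the figures in the text (for example about 1345 instead of 1350 for k3) come from the text multiplying already rounded intermediate values.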

In this manner, in the second embodiment, the
input character string corresponding to a word to be
recognized is delimited into characters, and the
characteristics of character spacing are extracted by
this character delimiting. In addition, recognition
processing is performed for each character obtained
by the above character delimiting. Then, a probability
is obtained at which the characteristics
obtained as the result of character recognition are
generated, conditioned on the characters and the
characteristics of character spacing of the words
contained in a word dictionary that stores in advance
candidates of words to be recognized and of
characteristics of character spacing in a word. The thus
obtained probability is divided by a probability at
which the characteristics obtained as the result of
character recognition are generated.
Then, the above calculation results obtained for
the characters and the characteristics of character
spacing of each word contained in the word dictionary
are multiplied together for all the characters and
character spacing. The recognition result of the above
word is obtained based on this multiplication result.

That is, in word recognition using the character
recognition result, an evaluation function is used
based on a posteriori probability considering at least
the ambiguity of word delimiting. In this way, even in
the case where word delimiting is not reliable, word
recognition can be performed precisely.

Now, a description will be given of Bayes
Estimation according to a third embodiment of the
present invention when no character spacing is provided
or noise entry occurs. In this case, the Bayes
Estimation is effective when the absence of character
spacing or noise entry cannot be ignored.

6. INTEGRATION OF THE ABSENCE OF CHARACTER 
SPACING AND NOISE ENTRY 

The methods described in the foregoing chapters 1
to 5 assume that each character is always delimited
correctly. If this assumption is not met because no
character spacing is provided, the above methods cannot
be used. In addition, these methods cannot
cope with noise entry. In this chapter, Bayes
Estimation that copes with the absence of character
spacing or noise entry is performed by changing the
categories.

6.1 Definition of Formulas 

Definitions are added and changed as follows based
on the definitions in chapter 5.
Changes

• Category K = {ki}

ki = (wjk, h), wjk ∈ W, W: A set of derivative
character strings

In the foregoing description, "wd" may be
expressed in place of "wjk".
Addition

• Derivative character string

wjk = (wjk1, wjk2, ..., wjkLjk, w'jk0, w'jk1, ...,
w'jkLjk)

Ljk: Number of characters in derivative character
string "wjk"

wjkl: l-th character of wjk, wjkl ∈ C

w'jkl: Whether or not a word break occurs between the
l-th character and the (l + 1)-th character, w'jkl ∈ S,
w'jk0 = w'jkLjk = s0

• Relationship between derivative character string
wjk and character string wj

Assume that an action ajkl ∈ A acts between the l-th
character and the (l + 1)-th character in character
string wj, whereby a derivative character string wjk
can be formed.

A = {a0, a1, a2}  a0: No action  a1: No character
spacing  a2: Noise entry

• a0: No action

Nothing is done for the character spacing.

• a1: No character spacing

The spacing between the two characters is not
provided. The two characters are converted into one
non-character by this action.

Example: The spacing between T and A of ONTARIO
is not provided. ON#RIO (# denotes a non-character
produced by providing no character spacing.)

• a2: Noise entry

A noise (non-character) is entered between the two
characters.

Example: A noise is entered between N and T
of ONT.

ON*T (* denotes a non-character due to noise.)

However, when l = 0 or Lj, it is assumed that noises
are generated at the left and right ends of a character
string "wc", respectively. In addition, this
definition assumes that noise does not enter at two or
more consecutive positions.

• Non-character y ∈ C

A non-character produced by the absence of character
spacing or by noise entry is identified as "y", and is
included in the character set C.
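
The two actions can be illustrated with a small Python sketch that enumerates derivative character strings. This is a simplification made for illustration: the hypothetical function `derivative_strings` applies exactly one action (a1 or a2) at one position, whereas the definition above allows an action at every position of the character string. The marks "#" and "*" follow the examples in the text.

```python
def derivative_strings(word, merged="#", noise="*"):
    """Return the strings formed by one action a1 or a2 applied to word."""
    results = set()
    # a1: no spacing between the l-th and (l + 1)-th characters;
    # the two characters collapse into one non-character.
    for i in range(len(word) - 1):
        results.add(word[:i] + merged + word[i + 2:])
    # a2: a noise non-character enters between two characters,
    # or at the left or right end (l = 0 or l = Lj).
    for i in range(len(word) + 1):
        results.add(word[:i] + noise + word[i:])
    return results


print(sorted(derivative_strings("ONT")))
print("ON#RIO" in derivative_strings("ONTARIO"))
```

Each derivative string then becomes a candidate in the category set, so its length may differ from that of the dictionary word it was derived from.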

At this time, the posteriori probability P (ki | r)
is similar to that obtained by the above formulas 23
and 24.

$$P(k_i \mid r) = \frac{P(r_c \mid k_i)\,P(r_s \mid k_i)\,P(k_i)}{P(r_c, r_s)} \qquad (41)$$

P (rc | ki) is substantially similar to that obtained
by the above formula 25.

$$P(r_c \mid k_i) = P(r_{c1}, r_{c2}, \ldots, r_{ch} \mid k_i) \left\{ \prod_{l=1}^{L_{jk}} P(r_{c\,h+l} \mid w_{jkl}) \right\} P(r_{c\,h+L_{jk}+1}, \ldots, r_{cL} \mid k_i) \qquad (42)$$

P (rs | ki) is also substantially similar to that
obtained by the above formula 26.

$$P(r_s \mid k_i) = P(r_{s1}, r_{s2}, \ldots, r_{s\,h-1} \mid k_i) \left\{ \prod_{l=0}^{L_{jk}} P(r_{s\,h+l} \mid w'_{jkl}) \right\} P(r_{s\,h+L_{jk}+1}, \ldots, r_{s\,L-1} \mid k_i) \qquad (43)$$

6.2 Description of P (ki) 

Assume that P(wc) is obtained in advance. In
practice, when the address of the mail P is read, for
example, P(wc) is affected by the position in the
letter or the position in the line; here, P(wc) is
assumed to be assigned as the expected value thereof.
At this time, a relationship between P(wd) and P(wc)
is considered as follows.

$$P(w_{jk}) = P(w_j)\left\{\prod_{l=1}^{L_j-1} P(a_{jkl})\right\} P(a_{jk0})\,P(a_{jkL_j}) \tag{44}$$

That is, the absence of character spacing and noise
entry can be integrated into the framework described
up to chapter 5 by providing a probability P(a1) of the
absence of character spacing and a noise entry
probability P(a2). In the above formula 44, the factor
P(ajk0) P(ajkLj) is a term concerning whether or not
noise occurs at both ends. In general, the probability
that noise exists between characters differs from the
probability that noise exists at either end. Thus, for
both ends, a value other than the noise entry
probability P(a2) is assumed to be defined.

A relationship between P(wc) and P(wc, h), or
a relationship between P(wd) and P(wd, h), depends
on how the effects described previously (such as the
position in a letter) are modeled and/or approximated.
Thus, a description is omitted here.

6.3 Description of a Non-Character y 
Consider a case in which characters specified as
a first candidate are used as character
characteristics, as shown in subsection 3.2.1. When a
non-character "y" is extracted as characteristics,
every character is considered to be equally likely to
be generated as the first candidate. Such a
non-character is therefore handled as follows.

$$P(e_i \mid y) = \frac{1}{n(E)} \tag{45}$$

6.4 Specific Example 

As in subsection 5.3, for example, consider that
a city name is read in address reading of a mail P in
English, as shown in FIG. 17.

In order to clarify the characteristics of this
section, it is assumed that word delimiting is
completely successful and that no category contains a
character string consisting of a plurality of words.
FIG. 17 shows the result of delimiting processing of a
character pattern that corresponds to the portion at
which a city name is believed to be written, as
identified by the above described delimiting
processing; a total of five characters are detected.
The word dictionary 10 stores all city names, as shown
in FIG. 18. In this case, three city names are stored:
SISTAL, PETAR, and STAL.

FIG. 19 illustrates a category set, in which
character strings each consisting of five characters
are listed from among the derivative character strings
made based on the word dictionary 10. If all derivative
character strings each consisting of five characters
were listed, "P#A*R" or the like deriving from "PETAR"
would also have to be included. However, in the case
where the probability P(a1) of the absence of character
spacing or the noise entry probability P(a2) described
in section 6.2 is sufficiently small, such character
strings can be ignored. In this example, such character
strings are ignored.

Categories k1 to k5 are each derived from the word
"SISTAL"; category k6 is the word "PETAR"; and
categories k7 to k11 are each derived from the word
"STAL". Specifically, the category k1 is "#STAL"; the
category k2 is "S#TAL"; the category k3 is "SI#AL"; the
category k4 is "SIS#L"; the category k5 is "SIST#"; the
category k6 is "PETAR"; the category k7 is "*STAL"; the
category k8 is "S*TAL"; the category k9 is "ST*AL"; the
category k10 is "STA*L"; and the category k11 is
"STAL*".
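
The construction of this category set can be illustrated with a minimal sketch. It assumes, as in this example, that at most one action (a1 or a2) is applied per word; the function name `derivatives` and the use of Python are illustrative choices, not part of the described method.

```python
def derivatives(word, length):
    """Derivative strings of `word` that have exactly `length` symbols,
    using at most one action:
      a1 - two adjacent characters merge into one non-character '#'
      a2 - one noise non-character '*' enters between characters or at an end
    """
    out = []
    if len(word) == length:                 # a0 everywhere: the word itself
        out.append(word)
    if len(word) == length + 1:             # exactly one a1 action
        for i in range(len(word) - 1):
            out.append(word[:i] + "#" + word[i + 2:])
    if len(word) == length - 1:             # exactly one a2 action
        for i in range(len(word) + 1):
            out.append(word[:i] + "*" + word[i:])
    return out


dictionary = ["SISTAL", "PETAR", "STAL"]
categories = [d for w in dictionary for d in derivatives(w, 5)]
for n, c in enumerate(categories, start=1):
    print(f"k{n}: {c}")
```

With five detected character patterns, the sketch lists eleven categories in the order k1 to k11. Strings that need two or more actions, such as "P#A*R", are left out on purpose.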
Character recognition is performed for each of
the character patterns shown in FIG. 17 by the above
described character recognition processing. The
posteriori probability of each category shown in
FIG. 19 is calculated on the basis of the character
recognition result obtained for each character
pattern.

Although various characteristics can be used for
calculation (= character recognition result), an
example using characters specified as a first candidate
is shown here. In this case, the character recognition
result is "S, E, T, A, L" in order from the left-most
character pattern shown in FIG. 17. In accordance with
the above formula 16, the change P(k2 | r) / P(k2) in
the probability of generating the category k2 (S#TAL)
shown in FIG. 19, caused by knowing the character
recognition result, is obtained as follows.

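$$\frac{P(k_2 \mid r)}{P(k_2)} \approx \frac{P(\text{"S"} \mid \text{"S"})}{P(\text{"S"})} \cdot \frac{P(\text{"E"} \mid \text{"\#"})}{P(\text{"E"})} \cdot \frac{P(\text{"T"} \mid \text{"T"})}{P(\text{"T"})} \cdot \frac{P(\text{"A"} \mid \text{"A"})}{P(\text{"A"})} \cdot \frac{P(\text{"L"} \mid \text{"L"})}{P(\text{"L"})} \tag{46}$$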

Further, by using the approximation described in
section 3.2 and subsection 4.2.2, for example, when
p = 0.5 and n(E) = 26, q = 0.02. Thus, the above
formula 46 is calculated as follows.

$$\frac{P(k_2 \mid r)}{P(k_2)} \approx p \cdot \frac{1}{n(E)} \cdot p \cdot p \cdot p \cdot n(E)^5 \approx 28600 \tag{47}$$
In the above calculation, the term for the
non-character cancels out, so the calculation is
equivalent to a calculation over the four characters
other than the non-character. The other categories are
calculated similarly. Here, k6, k7, and k8, which are
easily estimated to indicate large values, are
calculated as typical examples.

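$$
\begin{aligned}
\frac{P(k_6 \mid r)}{P(k_6)} &\approx q \cdot p \cdot p \cdot p \cdot q \cdot n(E)^5 \approx 594 \\
\frac{P(k_7 \mid r)}{P(k_7)} &\approx \frac{1}{n(E)} \cdot q \cdot p \cdot p \cdot p \cdot n(E)^5 \approx 1140 \\
\frac{P(k_8 \mid r)}{P(k_8)} &\approx p \cdot \frac{1}{n(E)} \cdot p \cdot p \cdot p \cdot n(E)^5 \approx 28600
\end{aligned}
\tag{48}
$$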
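
The ratios in formulas 47 and 48 can be checked with a short sketch. It assumes the first-candidate approximation used in this example: a matching character contributes p, a mismatching character contributes q, a non-character contributes 1/n(E), and every denominator P(e) is taken as 1/n(E). The function name `change_ratio` is illustrative only.

```python
def change_ratio(category, recognized, p=0.5, q=0.02, n_e=26):
    """Approximate P(k|r)/P(k) for one category string.

    category   - derivative string, with '#' or '*' marking a non-character
    recognized - first-candidate characters, one per character pattern
    """
    value = 1.0
    for expected, seen in zip(category, recognized):
        if expected in "#*":        # non-character: every first candidate equally likely
            likelihood = 1.0 / n_e
        elif expected == seen:      # first candidate matches the category character
            likelihood = p
        else:                       # first candidate differs
            likelihood = q
        value *= likelihood * n_e   # dividing by P(e) = 1/n(E)
    return value


result = "SETAL"
for name, cat in [("k2", "S#TAL"), ("k6", "PETAR"), ("k7", "*STAL"), ("k8", "S*TAL")]:
    print(name, cat, round(change_ratio(cat, result), 1))
```

Under these assumptions the sketch gives about 28561 for k2 and k8, about 594 for k6, and about 1142 for k7, which agree with the rounded values 28600, 594, and 1140.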



In comparing these values, chapter 5 assumes that the
values of P(ki) are equal to each other. In this
section, however, as described in section 6.2, P(ki)
changes when the absence of character spacing or noise
entry is considered. Thus, all the values of P(ki)
before such change occurs are assumed to be equal to
each other, and P(ki) = P0 is defined. P0 can be
considered to be P(wc) in the above formula 44.
In addition, P(ki) after such change has occurred is
considered to be P(wd) in the above formula 44.
Therefore, P(ki) after such change has occurred is
obtained as follows.

$$P(k_i) = P_0\left\{\prod_{l=1}^{L_j-1} P(a_{jkl})\right\} P(a_{jk0})\,P(a_{jkL_j}) \tag{49}$$

In this formula, assuming that the probability of
the absence of character spacing is P(a1) = 0.05, the
probability of noise entry between characters is
P(a2) = 0.002, and the probability of noise entry at
either end is P'(a2) = 0.06, for example, P(k2) is
calculated as follows.

$$P(k_2) = P_0 \cdot 0.948 \cdot 0.05 \cdot 0.948 \cdot 0.948 \cdot 0.948 \cdot 0.94 \cdot 0.94 \approx 0.0357\,P_0 \tag{50}$$

In this calculation, the probability that neither the
absence of character spacing nor noise entry occurs,
P(a0) = 1 - P(a1) - P(a2) = 0.948, is used, and the
probability that no noise enters at an end,
P'(a0) = 1 - P'(a2) = 0.94, is used.

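Similarly, when P(k6), P(k7), and P(k8) are
calculated, the following result is obtained.

$$
\begin{aligned}
P(k_6) &= P_0 \cdot 0.948 \cdot 0.948 \cdot 0.948 \cdot 0.948 \cdot 0.94 \cdot 0.94 \approx 0.714\,P_0 \\
P(k_7) &= P_0 \cdot 0.948 \cdot 0.948 \cdot 0.948 \cdot 0.06 \cdot 0.94 \approx 0.0481\,P_0 \\
P(k_8) &= P_0 \cdot 0.002 \cdot 0.948 \cdot 0.948 \cdot 0.94 \cdot 0.94 \approx 0.00159\,P_0
\end{aligned}
\tag{51}
$$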
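
The factors in formulas 50 and 51 follow directly from formula 49 and can be reproduced with a small sketch. The probabilities are the example values given above; the function name `prior_factor` and the list encoding of the actions are illustrative assumptions.

```python
P_A1 = 0.05                    # absence of character spacing between two characters
P_A2 = 0.002                   # noise entry between two characters
P_A0 = 1 - P_A1 - P_A2         # no action between two characters (0.948)
P_END_NOISE = 0.06             # noise entry at the left or right end, P'(a2)
P_END_CLEAN = 1 - P_END_NOISE  # no noise at an end, P'(a0) = 0.94


def prior_factor(gap_actions, left_noise=False, right_noise=False):
    """Return P(ki)/P0 for one derivative string.

    gap_actions - one of 'a0', 'a1', 'a2' for each gap between adjacent
                  characters of the dictionary word
    """
    gap_prob = {"a0": P_A0, "a1": P_A1, "a2": P_A2}
    value = 1.0
    for action in gap_actions:
        value *= gap_prob[action]
    value *= P_END_NOISE if left_noise else P_END_CLEAN
    value *= P_END_NOISE if right_noise else P_END_CLEAN
    return value


k2 = prior_factor(["a0", "a1", "a0", "a0", "a0"])  # SISTAL -> S#TAL
k6 = prior_factor(["a0"] * 4)                      # PETAR unchanged
k7 = prior_factor(["a0"] * 3, left_noise=True)     # STAL -> *STAL
k8 = prior_factor(["a2", "a0", "a0"])              # STAL -> S*TAL
print(k2, k6, k7, k8)
```

The printed factors round to 0.0357, 0.714, 0.0481, and 0.00159, respectively.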
When the above formulas 47 and 48 are combined with
the above formulas 50 and 51, the following result is
obtained.

$$
\begin{aligned}
P(k_2 \mid r) &\approx 28600 \cdot 0.0357\,P_0 \approx 1020\,P_0 \\
P(k_6 \mid r) &\approx 594 \cdot 0.714\,P_0 \approx 424\,P_0 \\
P(k_7 \mid r) &\approx 1140 \cdot 0.0481\,P_0 \approx 54.8\,P_0 \\
P(k_8 \mid r) &\approx 28600 \cdot 0.00159\,P_0 \approx 45.5\,P_0
\end{aligned}
\tag{52}
$$

When the other categories are calculated similarly for
reference, the following result is obtained.

$$
\begin{aligned}
&P(k_1 \mid r) \approx 40.7\,P_0, \quad P(k_3 \mid r) \approx 40.7\,P_0, \\
&P(k_4 \mid r) \approx 1.63\,P_0, \quad P(k_5 \mid r) \approx 0.0653\,P_0, \\
&P(k_9 \mid r) \approx 1.81\,P_0, \quad P(k_{10} \mid r) \approx 0.0727\,P_0, \\
&P(k_{11} \mid r) \approx 0.0880\,P_0
\end{aligned}
$$

From the foregoing, the category with the highest
posteriori probability is k2, and it is estimated that
the city name written in FIG. 17 is SISTAL, with no
character spacing provided between I and S.

As described above, according to the third 
embodiment, the characters of words contained in a word 
dictionary include information on non-characters as 
well as characters. In addition, a probability of 
generating words each consisting of characters that 
include non-character information is set based on 
a probability of generating words each consisting 
of characters that do not include any non-character 
information. In this manner, word recognition can be 
performed by using an evaluation function based on 
a posteriori probability considering the absence of 
character spacing or noise entry. Therefore, even in
the case where no character spacing is provided or
noise entry occurs, word recognition can be performed
precisely.

Now, a description will be given of Bayes
Estimation according to a fourth embodiment of the
present invention, for the case where characters are
not delimited uniquely. In this case, the Bayes
Estimation is effective for characters that can
themselves be split by delimiting, such as Japanese
Kanji characters or Kana characters. In addition, the
Bayes Estimation is also effective for calligraphic
characters in English, for which many break candidates
other than actual character breaks must be presented.

7. INTEGRATION OF CHARACTER DELIMITING 
The methods described in chapters 1 to 6 assume 
that characters themselves are not delimited. However, 
there is a case in which characters such as Japanese 
Kanji or Kana characters themselves are delimited into 
two or more. For example, in a Kanji character "明",
when character delimiting is performed, "日" and "月"
are identified separately as character candidates.
At this time, a plurality of character delimiting 
candidates appear depending on whether these two 
character candidates are integrated with each other or 
separated from each other. 

Such character delimiting cannot be handled by
the methods described in chapters 1 to 6. Conversely,
in the case where many characters that are in contact
with each other are present and are subjected to
delimiting processing, the characters themselves may be
cut as well as the portions at which characters
actually contact. Although this will be described later
in detail, it is better, as a recognition strategy, to
permit cutting of the characters themselves to a
certain extent. In this case as well, the methods
described in chapters 1 to 6 cannot be used. In this
chapter, Bayes Estimation is performed which handles
a plurality of character delimiting candidates caused
by character delimiting.

7.1 Character Delimiting 

In character delimiting targeted for character
contact, processing for cutting a character contact is
performed. In this processing, when a case in which
a portion that is not a character break is specified as
a break candidate is compared with a case in which a
character break is not specified as a break candidate,
the latter affects recognition more seriously. The
reasons are stated as follows.

• When a portion that is not a character break is
specified as a break candidate

Both a case in which a break is executed at the
break candidate and a case in which such a break is
not executed can be attempted. Thus, even if too many
break candidates are specified, correct character
delimiting is not necessarily prevented.



• When a character break is not specified as 
a break candidate 

There is no means for obtaining correct character 
delimiting. 

Therefore, in character delimiting, it is
effective to specify many break candidates in addition
to the actual character breaks. However, attempting
both a case in which a break is performed at a break
candidate and a case in which such a break is not
performed means that there are a plurality of character
delimiting patterns. In the methods described in
chapters 1 to 6, comparison between different character
delimiting pattern candidates cannot be performed.
Therefore, the method described here is used to solve
this problem.

7.2 Definition of Formulas 

The definitions are added and changed as follows
based on the definitions in chapter 6.

Changes

• Break state set S = {s0, s1, s2 (, s3)}

s0: Word break
s1: Character break
s2: No character break
(s3: Start or end of line)

"Break" as defined in chapter 5 and subsequent
chapters means a word break, which falls into s0.
"No break" falls into s1 and s2.

• L: Number of portions divided at break
candidates (each portion is referred to as a cell)

Addition

• Unit uij (i < j)

This unit is formed by combining the i-th cell
through the (j - 1)-th cell.
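
As a small illustration of this definition, the following sketch enumerates every unit for L cells. It assumes the reading that unit uij covers cells i through j - 1, so that j runs up to L + 1; the function names are hypothetical.

```python
def all_units(num_cells):
    """All index pairs (i, j) with 1 <= i < j <= L + 1."""
    return [(i, j)
            for i in range(1, num_cells + 1)
            for j in range(i + 1, num_cells + 2)]


def cells_of(unit):
    """Cells covered by unit (i, j): i, i + 1, ..., j - 1."""
    i, j = unit
    return list(range(i, j))


units = all_units(5)
print(len(units))        # number of candidate units for five cells
print(cells_of((1, 3)))  # the unit that joins cells 1 and 2
```

For L cells there are L(L + 1)/2 such units, which is why the character characteristics defined below are indexed by a pair of cell numbers.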
Change

• Category K = {ki}

ki = (wjk, mjk, h), wjk ∈ W

mjk = (mjk1, mjk2, ..., mjkLjk, mjkLjk+1)

mjkl: Start cell number of the unit to which
character "wjkl" applies. The unit can be expressed as
u(mjkl, mjkl+1), where mjkl+1 denotes the (l + 1)-th
element of mjk.

h: Position of the derivative character string
"wjk". The derivative character string "wjk" starts
from the (h + 1)-th cell.

Addition

• Break pattern k'i = (k'i0, k'i1, ..., k'iLc)

k'i: Break states in ki

Lc: Total number of cells included in all units
to which a derivative character string "wjk" applies.

k'il: State k'il ∈ S of the break between the
(h + l)-th cell and the (h + l + 1)-th cell

k'il = s0 (when a word break occurs, namely,
when ∃n, w'jkn = s0, l = mjkn+1 - h - 1)
k'il = s2 (when ∀n, l ≠ mjkn - h - 1)
k'il = s1 (when a case other than the above occurs)
Change

• Character characteristics

rc = (rc12, rc13, rc14, ..., rc1L+1, rc23, rc24,
..., rc2L+1, ..., rcLL+1)

rcn1n2: Character characteristics of unit un1n2

• Characteristics of character spacing
rs = (rs0, rs1, ..., rsL)

rsn: Characteristics of character spacing between
the n-th cell and the (n + 1)-th cell

At this time, a posteriori probability P(ki | r) is
similar to the above formulas 23 and 24.

$$P(k_i \mid r) = \frac{P(r_c \mid k_i)\,P(r_s \mid k_i)\,P(k_i)}{P(r_c, r_s)} \tag{53}$$

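P(rc | ki) is represented as follows.

$$
\begin{aligned}
P(r_c \mid k_i) &= P(r_{c\,m_{jk1} m_{jk2}} \mid w_{jk1})\,P(r_{c\,m_{jk2} m_{jk3}} \mid w_{jk2}) \cdots P(r_{c\,m_{jkL_{jk}} m_{jk,L_{jk}+1}} \mid w_{jkL_{jk}}) \cdot P(\ldots, r_{c\,n_1 n_2}, \ldots \mid k_i) \\
&= \left\{\prod_{n=1}^{L_{jk}} P(r_{c\,m_{jkn} m_{jk,n+1}} \mid w_{jkn})\right\} P(\ldots, r_{c\,n_1 n_2}, \ldots \mid k_i)
\end{aligned}
\tag{54}
$$

Here, the last factor ranges over the pairs (n1, n2)
such that, for all b with 1 ≤ b ≤ Ljk,
(n1, n2) ≠ (mjkb, mjkb+1).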

P(rs | ki) is represented as follows.

$$
\begin{aligned}
P(r_s \mid k_i) = {} & P(r_{s1}, r_{s2}, \ldots, r_{s,h-1} \mid k_i) \\
& \cdot P(r_{sh} \mid k'_{i0})\,P(r_{s,h+1} \mid k'_{i1}) \cdots P(r_{s,h+L_c} \mid k'_{iL_c}) \\
& \cdot P(r_{s,h+L_c+1}, \ldots, r_{s,L-1} \mid k_i)
\end{aligned}
\tag{55}
$$

As for P(ki), "mjk" is contained in a category "ki" in
this section, and thus the effect of "mjk" should
be considered. Although "mjk" is considered to affect
the shape of the unit to which each individual
character applies, the characters that apply to such a
unit, the balance in shape between adjacent units, and
the like, a description of its modeling is omitted
here.

7.3 Approximation for Practical Use 

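7.3.1 Approximation relevant to a portion free of
a character string and normalization of the number of
characters

When approximation similar to that in subsection
4.2.1 is used for the above formula 54, the following
result is obtained.

$$P(r_c \mid k_i) \approx \left\{\prod_{n=1}^{L_{jk}} P(r_{c\,m_{jkn} m_{jk,n+1}} \mid w_{jkn})\right\} \prod_{n_1, n_2} P(r_{c\,n_1 n_2}) \tag{56}$$

Here, the second product is taken over the pairs
(n1, n2) such that, for all b with 1 ≤ b ≤ Ljk,
(n1, n2) ≠ (mjkb, mjkb+1).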
In reality, it is considered that there is some
correlation among "rcn1n3", "rcn1n2", and "rcn2n3",
and thus this approximation is coarser than that
described in subsection 4.2.1.

In addition, when the above formula 55 is
approximated similarly, the following result is
obtained.

$$P(r_s \mid k_i) \approx \left\{\prod_{n=0}^{L_c} P(r_{s,h+n} \mid k'_{in})\right\} \prod_{\substack{1 \le n \le h-1 \\ h+L_c+1 \le n \le L-1}} P(r_{sn}) \tag{57}$$

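Further, when P(ki | r) / P(ki) is calculated in
a manner similar to that described in subsection 5.2.1,
the following result is obtained.

$$
\begin{aligned}
\frac{P(k_i \mid r)}{P(k_i)} &\approx \frac{P(k_i \mid r_c)}{P(k_i)} \cdot \frac{P(k_i \mid r_s)}{P(k_i)} \\
&\approx \prod_{n=1}^{L_{jk}} \frac{P(r_{c\,m_{jkn} m_{jk,n+1}} \mid w_{jkn})}{P(r_{c\,m_{jkn} m_{jk,n+1}})} \cdot \prod_{n=0}^{L_c} \frac{P(r_{s,h+n} \mid k'_{in})}{P(r_{s,h+n})}
\end{aligned}
\tag{58}
$$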

As in the above formula 32, the above formula 58
contains no term concerning a portion to which the
derivative character string "wd" does not apply, and
"normalization by denominator" is performed.

7.3.2 Break and character spacing characteristics
"rs"

Unlike chapters 1 to 6, in this subsection, s2
(No character break) is specified as a break state.
Thus, in the case where the set D of character spacing
characteristics is used in a manner similar to that
described in subsection 5.2.2, the following values
are required.

P(dk | sl), k = 0, 1, 2, l = 0, 1, 2

It must be noted here that all of these values are
limited to portions specified as "a break candidate",
as described in section 7.1. s2 (No character break)
means that a portion is specified as a break
candidate, but no break occurs there. This point should
be noted when the following values are obtained.

P(dk | s2), k = 0, 1, 2

The same applies to a case in which the following
values are obtained.

P(dk), k = 0, 1, 2

7.4 Specific Example 

As in section 6.4, consider that a city name is 
read in address reading of mail P written in English. 

For clarifying the characteristics of this
section, it is assumed that word delimiting is
completely successful, that no category contains a
character string consisting of a plurality of words,
that no noise entry occurs, and that all the character
breaks are detected by character delimiting (that is,
unlike section 6, there is no need for categories
concerning noise or space-free characters).

FIG. 20 shows a portion at which it is believed
that a city name is written, and five cells are
present. FIG. 21A to FIG. 21D show possible character
delimiting pattern candidates. In this example, for
clarity, it is assumed that the spacing between cells 2
and 3 and the spacing between cells 4 and 5 are always
found to have been delimited (the probability that
characters are not delimited there is very low, and may
be ignored).

The delimiting candidates are present between
cells 1 and 2 and between cells 3 and 4. The possible
character delimiting pattern candidates are exemplified
as shown in FIG. 21A to FIG. 21D. FIG. 22 shows the
contents of the word dictionary 10 in which all city
names are stored. In this example, there are three
candidates for city names.

In this case, three city names are stored as 
BAYGE, RAGE, and ROE. 

FIG. 23A to FIG. 23D each illustrate a category
set. It is assumed that word delimiting is completely
successful. Thus, BAYGE applies to FIG. 21A; RAGE
applies to FIG. 21B and FIG. 21C; and ROE applies to
FIG. 21D.

In the category kl shown in FIG. 23A, the interval 
between cells 1 and 2 and that between cells 3 and 4 
correspond to separation points between characters. 

In the category k2 shown in FIG. 23B, the interval 
between cells 1 and 2 corresponds to a separation point 
between characters, while the interval between cells 3 
and 4 does not. 
In the category k3 shown in FIG. 23C, the interval
between cells 3 and 4 corresponds to a separation point
between characters, while the interval between cells 1
and 2 does not.

In the category k4 shown in FIG. 23D, neither the
interval between cells 1 and 2 nor that between cells 3
and 4 corresponds to a separation point between
characters.
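
The four delimiting patterns behind categories k1 to k4 can be enumerated with a short sketch. It assumes that the gap after cell g (between cells g and g + 1) is identified by the number g, that gaps 2 and 4 are always breaks, and that gaps 1 and 3 are the only break candidates; the function name is illustrative.

```python
from itertools import product


def delimiting_patterns(num_cells, fixed_breaks, candidate_breaks):
    """List every delimiting pattern as a list of units (tuples of cells)."""
    patterns = []
    for choice in product([True, False], repeat=len(candidate_breaks)):
        breaks = set(fixed_breaks)
        breaks |= {g for g, used in zip(candidate_breaks, choice) if used}
        units, start = [], 1
        for cell in range(1, num_cells + 1):
            # close the current unit at a break or at the last cell
            if cell in breaks or cell == num_cells:
                units.append(tuple(range(start, cell + 1)))
                start = cell + 1
        patterns.append(units)
    return patterns


patterns = delimiting_patterns(5, fixed_breaks=[2, 4], candidate_breaks=[1, 3])
for n, units in enumerate(patterns, start=1):
    print(f"k{n}: {units}")
```

With two break candidates there are 2 x 2 = 4 patterns. The enumeration order used here happens to match k1 (five units), k2, k3, and k4 (three units).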

Each of the units that appear in FIG. 23A to
FIG. 23D is applied to character recognition, and the
character recognition result is used for calculating
the posteriori probabilities of the categories shown in
FIG. 23A to FIG. 23D. Although various characteristics
can be used for calculation (= character recognition
result), an example using characters specified as a
first candidate is shown below.

FIG. 24 shows the recognition result of each unit.
For example, this figure shows that the first candidate
of the recognition result is R for the unit in which
cells 1 and 2 are connected to each other.

Although various character spacing characteristics
are conceivable, the example described in
subsection 5.2.2 is simplified here, and the following
is used.

• Set of character spacing characteristics
D' = {d'1, d'2}

d'1: Character spacing
d'2: No character spacing

FIG. 25 shows characteristics of character spacing
between cells 1 and 2, and between cells 3 and 4.
Character spacing is provided between cells 1 and 2,
and no character spacing is provided between cells 3
and 4.

When the approximation described in subsection 7.3.1
is used, in accordance with the above formula 58,
the change P(k1 | rc) / P(k1) in the probability of
generating category "k1" (BAYGE), caused by knowing
the recognition result shown in FIG. 24, is obtained
by the following formula.

$$\frac{P(k_1 \mid r_c)}{P(k_1)} \approx \frac{P(\text{"B"} \mid \text{"B"})}{P(\text{"B"})} \cdot \frac{P(\text{"A"} \mid \text{"A"})}{P(\text{"A"})} \cdot \frac{P(\text{"A"} \mid \text{"Y"})}{P(\text{"A"})} \cdot \frac{P(\text{"G"} \mid \text{"G"})}{P(\text{"G"})} \cdot \frac{P(\text{"E"} \mid \text{"E"})}{P(\text{"E"})} \tag{59}$$

In the above formula 58, the change P(k1 | rs) / P(k1)
caused by knowing the characteristics of character
spacing shown in FIG. 25 is obtained by the following
formula.

$$\frac{P(k_1 \mid r_s)}{P(k_1)} \approx \frac{P(d'_1 \mid s_1)}{P(d'_1)} \cdot \frac{P(d'_2 \mid s_1)}{P(d'_2)} \tag{60}$$

In order to make a calculation using the above
formula 59, when the approximation described in
subsections 3.2.2 and 4.2.2 is used with, for example,
p = 0.5 and n(E) = 26, q = 0.02 is obtained. Thus, the
above formula 59 is calculated as follows.

$$\frac{P(k_1 \mid r_c)}{P(k_1)} \approx p \cdot p \cdot q \cdot p \cdot p \cdot n(E)^5 \approx 14900 \tag{61}$$

In order to make a calculation using the above formula
60, it is required to obtain the following values
in advance.

P(d'k | sl), k = 1, 2, l = 1, 2, and P(d'k), k = 1, 2

As an example, it is assumed that the
values shown in Tables 4 and 5 are obtained.



Table 5
Values of P(d'k | sl)

                              1: Character       2: No character
                              spacing (d'1)      spacing (d'2)
1: Character break (s1)       P(d'1 | s1)        P(d'2 | s1)
                              0.90               0.10
2: No character break (s2)    P(d'1 | s2)        P(d'2 | s2)
                              0.02               0.98

Table 4 lists values of the joint probability
P(d'k ∩ sl).

Table 5 lists values of P(d'k | sl). In this case,
note that the relationship shown by the following
formula is met.

P(d'k ∩ sl) = P(d'k | sl) P(sl)

In reality, P(d'k | sl) / P(d'k) is required for
calculation using the above formula 60. Thus, Table 6
lists the thus calculated values.

Table 6
Values of P(d'k | sl) / P(d'k)

                              1: Character       2: No character
                              spacing (d'1)      spacing (d'2)
1: Character break (s1)       1.96               0.19
2: No character break (s2)    0.043              1.81
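
The step from Table 5 to Table 6 can be sketched as follows. The values of Table 4 are not reproduced in this text, so the sketch assumes P(s1) = P(s2) = 0.5; this is an assumption chosen because it reproduces the ratios used in formulas 62 and 65, not a value stated in the description.

```python
# Conditional probabilities P(d'k | sl) from Table 5.
p_d_given_s = {
    ("d1", "s1"): 0.90, ("d2", "s1"): 0.10,
    ("d1", "s2"): 0.02, ("d2", "s2"): 0.98,
}

# ASSUMPTION: break-state priors are not given here; 0.5 each is hypothetical.
p_s = {"s1": 0.5, "s2": 0.5}

# Marginal P(d'k) = sum over l of P(d'k | sl) P(sl).
p_d = {d: sum(p_d_given_s[(d, s)] * p_s[s] for s in p_s) for d in ("d1", "d2")}

# Ratio P(d'k | sl) / P(d'k), as needed for formula 60.
ratio = {key: p_d_given_s[key] / p_d[key[0]] for key in p_d_given_s}

for key in sorted(ratio):
    print(key, round(ratio[key], 3))
```

Under this assumption the marginals are P(d'1) = 0.46 and P(d'2) = 0.54, and the four ratios round to 1.96, 0.19, 0.043, and 1.81.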



The above formula 60 is calculated as follows,
based on the values shown in Table 6.

$$\frac{P(k_1 \mid r_s)}{P(k_1)} \approx 1.96 \cdot 0.19 \approx 0.372 \tag{62}$$

From the above formula 60, a change P(k1 | r) / P(k1)
caused by knowing the character recognition result
shown in FIG. 24 and the characteristics of character
spacing shown in FIG. 25 is represented by the product
of the above formulas 61 and 62, and the following
result is obtained.

$$\frac{P(k_1 \mid r)}{P(k_1)} \approx 14900 \cdot 0.372 \approx 5543 \tag{63}$$

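Similarly, with respect to k2 to k4 as well, when
P(ki | rc) / P(ki), P(ki | rs) / P(ki), and
P(ki | r) / P(ki) are obtained, the following result
is obtained.

$$
\begin{aligned}
\frac{P(k_2 \mid r_c)}{P(k_2)} &\approx 45.7 \\
\frac{P(k_3 \mid r_c)}{P(k_3)} &\approx 28600 \\
\frac{P(k_4 \mid r_c)}{P(k_4)} &\approx 2197
\end{aligned}
\tag{64}
$$

$$
\begin{aligned}
\frac{P(k_2 \mid r_s)}{P(k_2)} &\approx 1.96 \cdot 1.81 \approx 3.55 \\
\frac{P(k_3 \mid r_s)}{P(k_3)} &\approx 0.043 \cdot 0.19 \approx 0.00817 \\
\frac{P(k_4 \mid r_s)}{P(k_4)} &\approx 0.043 \cdot 1.81 \approx 0.0778
\end{aligned}
\tag{65}
$$

$$
\begin{aligned}
\frac{P(k_2 \mid r)}{P(k_2)} &\approx 45.7 \cdot 3.55 \approx 162 \\
\frac{P(k_3 \mid r)}{P(k_3)} &\approx 28600 \cdot 0.00817 \approx 249 \\
\frac{P(k_4 \mid r)}{P(k_4)} &\approx 2197 \cdot 0.0778 \approx 171
\end{aligned}
\tag{66}
$$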
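
The calculations for k1 to k4 can be reproduced with the following sketch. FIG. 24 is not reproduced in this text, so the first-candidate result of each unit is inferred here from the values in formulas 61, 64, and 66 (for example, "O" for the unit joining cells 3 and 4); these inferred results, the unit encoding, and all function names are assumptions made for illustration.

```python
P, Q, N_E = 0.5, 0.02, 26

# ASSUMPTION: first-candidate result per unit, inferred from formulas 61/64/66.
recognized = {(1,): "B", (2,): "A", (3,): "A", (4,): "G", (5,): "E",
              (1, 2): "R", (3, 4): "O"}

# Ratios P(d'k | sl) / P(d'k) used in formulas 62 and 65.
spacing_ratio = {("d1", "s1"): 1.96, ("d2", "s1"): 0.19,
                 ("d1", "s2"): 0.043, ("d2", "s2"): 1.81}

# Observed spacing at the two break candidates (gap after cell 1 and cell 3).
observed = {1: "d1", 3: "d2"}

categories = {
    "k1": ("BAYGE", [(1,), (2,), (3,), (4,), (5,)]),
    "k2": ("RAGE", [(1,), (2,), (3, 4), (5,)]),
    "k3": ("RAGE", [(1, 2), (3,), (4,), (5,)]),
    "k4": ("ROE", [(1, 2), (3, 4), (5,)]),
}


def char_ratio(word, units):
    """P(k|rc)/P(k): p or q per unit, divided by P(e) = 1/n(E)."""
    value = 1.0
    for ch, unit in zip(word, units):
        value *= (P if recognized[unit] == ch else Q) * N_E
    return value


def space_ratio(units):
    """P(k|rs)/P(k) over the two break candidates."""
    unit_ends = {u[-1] for u in units}   # gaps where this pattern breaks
    value = 1.0
    for gap, d in observed.items():
        state = "s1" if gap in unit_ends else "s2"
        value *= spacing_ratio[(d, state)]
    return value


for name, (word, units) in categories.items():
    c, s = char_ratio(word, units), space_ratio(units)
    print(name, word, round(c, 1), round(s, 5), round(c * s, 1))
```

Under these assumptions the character ratios come out as about 14852, 45.7, 28561, and 2197, and the combined ratios for k2 and k4 round to 162 and 171. For k3, multiplying the two factors directly gives about 233 rather than the listed 249; the final selection of k1 does not depend on this difference.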

In comparing these results, although it is assumed
in chapters 1 to 5 that the values of P(ki) are equal
to each other, the shape of characters is considered in
this section.

In FIG. 21D, the widths of the units are the most
uniform. In FIG. 21A, these widths are the second most
uniform. However, in FIG. 21B and FIG. 21C, these
widths are not uniform.

The degree of this uniformity is modeled by a
certain method, and the modeled degree is reflected in
P(ki), thereby enabling more precise word recognition.
Any method may be used here as long as such precise
word recognition is achieved.

In this example, it is assumed that the following
result is obtained.

$$P(k_1) : P(k_2) : P(k_3) : P(k_4) = 2 : 1 : 1 : 10 \tag{67}$$

When a proportionality constant P1 is defined and the
above formula 67 is applied to the formulas 63 and 66,
the following result is obtained.

$$
\begin{aligned}
P(k_1 \mid r) &\approx 5543 \cdot 2P_1 \approx 11086\,P_1 \\
P(k_2 \mid r) &\approx 162 \cdot P_1 \approx 162\,P_1 \\
P(k_3 \mid r) &\approx 249 \cdot P_1 \approx 249\,P_1 \\
P(k_4 \mid r) &\approx 171 \cdot 10P_1 \approx 1710\,P_1
\end{aligned}
\tag{68}
$$

From the foregoing, the category with the highest
posteriori probability is "k1", and it is estimated
that the city name is BAYGE.
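
The final selection step amounts to weighting each ratio by the relative prior and taking the maximum. A minimal sketch, using the values stated in formulas 63, 66, and 67 (the variable names are illustrative):

```python
# P(ki|r)/P(ki) from formulas 63 and 66.
change = {"k1": 5543, "k2": 162, "k3": 249, "k4": 171}

# Relative priors P(k1):P(k2):P(k3):P(k4) from formula 67.
prior_weight = {"k1": 2, "k2": 1, "k3": 1, "k4": 10}

# Posterior in units of the proportionality constant P1.
posterior = {k: change[k] * prior_weight[k] for k in change}

best = max(posterior, key=posterior.get)
print(posterior, best)
```

Even though k4 receives a prior five times larger than k1, the character recognition and spacing evidence for k1 dominates, so k1 is selected.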

As the result of character recognition shown in
FIG. 24, the highest priority is given to category k3
by the above formulas 61 and 64. As the result of
character spacing characteristics shown in FIG. 25, the
highest priority is given to category k2 by the above
formulas 62 and 65. Although the highest value in
evaluation of balance in character shape is given to
category k4, estimation based on all integrated results
is performed, whereby category k1 can be selected.

In this manner, according to the fourth
embodiment, an input character string corresponding
to a word to be recognized is delimited for each
character; plural kinds of delimiting results are
obtained considering character spacing by this
character delimiting; recognition processing is
performed for each of the characters specified as all
of the obtained delimiting results; and there is
obtained a probability at which there appear the
characteristics obtained as the result of character
recognition, conditioned on the characteristics of the
characters and character spacing of the words contained
in a word dictionary that stores candidates of the
characteristics of a word to be recognized and
character spacing of the word. In addition, the thus
obtained probability is divided by a probability at
which there appear the characteristics obtained as the
result of character recognition; the division results
obtained for each of the characteristics of the
characters and character spacing of the words contained
in the word dictionary are multiplied together over all
the characters and character spacing; and the
recognition result of the above word is obtained based
on the multiplication result.

That is, in word recognition using the character
recognition result, an evaluation function based on the
posteriori probability is used in consideration of at
least the ambiguity of character delimiting. In this
manner, even in the case where character delimiting is
not reliable, word recognition can be performed
precisely.

Additional advantages and modifications will
readily occur to those skilled in the art. Therefore,
the invention in its broader aspects is not limited to
the specific details and representative embodiments
shown and described herein. Accordingly, various
modifications may be made without departing from the
spirit or scope of the general inventive concept as
defined by the appended claims and their equivalents.



